llama.cpp

Commit Graph

Author	SHA1	Message	Date
Georgi Gerganov	97ab2b2578	Add Misc section + update hot topics + minor fixes	2 years ago
Sebastián A	2f700a2738	Add windows to the CI (#98)	2 years ago
Georgi Gerganov	c09a9cfb06	CMake build in Release by default (#75)	2 years ago
Georgi Gerganov	7ec903d3c1	Update contribution section, hot topics, limitations, etc.	2 years ago
Georgi Gerganov	4497ad819c	Print system information	2 years ago
Sebastián A	ed6849cc07	Initial support for CMake (#75)	2 years ago
Thomas Klausner	41be0a3b3d	Add NetBSD support. (#90)	2 years ago
Pavol Rusnak	671d5cac15	Use fprintf for diagnostic output (#48) keep printf only for printing model output one can now use ./main ... 2>dev/null to suppress any diagnostic output	2 years ago
Georgi Gerganov	84d9015c4a	Use vdotq_s32 to improve performance (#67) * 10% performance boost on ARM * Back to original change	2 years ago
uint256_t	63fd76fbb0	Reduce model loading time (#43) * Use buffering * Use vector * Minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2 years ago
Val Kharitonov	2a20f48efa	Fix UTF-8 handling (including colors) (#79)	2 years ago
Pavol Rusnak	d1f224712d	Add quantize script for batch quantization (#92) * Add quantize script for batch quantization * Indentation * README for new quantize.sh * Fix script name * Fix file list on Mac OS --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2 years ago
Georgi Gerganov	1808ee0500	Add initial contribution guidelines	2 years ago
Matvey Soloviev	a169bb889c	Gate signal support on being on a unixoid system. (#74)	2 years ago
Matvey Soloviev	460c482540	Fix token count accounting	2 years ago
Georgi Gerganov	c80e2a8f2a	Revert "10% performance boost on ARM" This reverts commit `113a9e83eb`. There are some reports for illegal instruction. Moved this stuff to vdotq_s32 branch until resolve	2 years ago
Georgi Gerganov	54a0e66ea0	Check for vdotq_s32 availability	2 years ago
Georgi Gerganov	543c57e991	Ammend to previous commit - forgot to update non-QRDMX branch	2 years ago
Georgi Gerganov	113a9e83eb	10% performance boost on ARM	2 years ago
Matvey Soloviev	404fac0d62	Fix color getting reset before prompt output done (#65) (cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)	2 years ago
Georgi Gerganov	1a0a74300f	Update README.md	2 years ago
Matvey Soloviev	96ea727f47	Add interactive mode (#61) * Initial work on interactive mode. * Improve interactive mode. Make rev. prompt optional. * Update README to explain interactive mode. * Fix OS X build	2 years ago
Marc Köhlbrugge	9661954835	Fix typo in README (#45)	2 years ago
Ben Garney	f385f8dee8	Allow using prompt files (#59)	2 years ago
beiller	02f0c6fe7f	Add back top_k (#56) * Add back top_k * Update utils.cpp * Update utils.h --------- Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2 years ago
Sebastián A	eb062bb012	Windows fixes (#31) * Apply fixes suggested to build on windows Issue: https://github.com/ggerganov/llama.cpp/issues/22 * Remove unsupported VLAs * MSVC: Remove features that are only available on MSVC C++20. * Fix zero initialization of the other fields. * Change the use of vector for stack allocations.	2 years ago
Georgi Gerganov	7027a97837	Update README.md	2 years ago
Georgi Gerganov	2d555e5b42	Add CI (#60)	2 years ago
Georgi Gerganov	7c9e54e55e	Revert "weights_only" arg - this causing more trouble than help	2 years ago
Oleksandr Nikitin	b9bd1d0141	python/pytorch compat notes (#44)	2 years ago
beiller	129c7d1ea8	Add repetition penalty (#20) * Adding repeat penalization * Update utils.h * Update utils.cpp * Numeric fix Should probably still scale by temp even if penalized * Update comments, more proper application I see that numbers can go negative so a fix from a referenced commit * Minor formatting --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2 years ago
Georgi Gerganov	702fddf5c5	Clarify meaning of hacking	2 years ago
Georgi Gerganov	7d86e25bf6	README: add "Supported platforms" + update hot topics	2 years ago
deepdiffuser	a93120236f	use weights_only in conversion script (#32) this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries	2 years ago
Pavol Rusnak	6a9a67f0be	Add LICENSE (#21)	2 years ago
Georgi Gerganov	da1a4ff01f	Update README.md	2 years ago
Juraj Bednar	6b2cb6302f	Fix a typo in model name (#16)	2 years ago
Georgi Gerganov	4235e3d5b3	Update README.md	2 years ago
Georgi Gerganov	f1eaff4721	Add AVX2 support for x86 architectures thanks to @Const-me !	2 years ago
Georgi Gerganov	a9e58529ea	Fix un-initialized FP16 tables on x86 (#15, #2)	2 years ago
Georgi Gerganov	7d9ed7b25f	Bump memory buffer	2 years ago
Georgi Gerganov	0c6803321c	Update README.md	2 years ago
Georgi Gerganov	f60fa9e50a	.gitignore models/	2 years ago
Georgi Gerganov	7211862c94	Update Makefile var + add comment	2 years ago
Georgi Gerganov	a5c5ae2f54	Update README.md	2 years ago
Georgi Gerganov	ea977e85ec	Update README.md	2 years ago
Georgi Gerganov	007a8f6f45	Support all LLaMA models + change Q4_0 quantization storage	2 years ago
Simon Willison	5f2f970d51	Include Python dependencies in README (#6)	2 years ago
Georgi Gerganov	73c6ed5e87	Update README.md	2 years ago
Georgi Gerganov	01eeed8fb1	Update README.md	2 years ago


311 Commits (d9a239c4104c888eafda672c1e42c9bbc5084cb8)
