llama.cpp

Commit Graph

Author	SHA1	Message	Date
Georgi Gerganov	516d88e75c	readme : add GPT4All instructions (close #588 )	1 year ago
Georgi Gerganov	53635c081c	py : add GPT4All conversion script For now: copy-paste Too much time for me to deduplicate the python code	1 year ago
Maël Kerbiriou	41318d708e	llama : use the same threshold for OpenBLAS and ggml thread limiting (#577 )	1 year ago
Tobias Lütke	a6956b25a1	add example of re-act pattern (#583 ) * add example of re-act pattern * spelling... * fixed whitespace in reverse prompt issue	1 year ago
anzz1	83df5639eb	Fix GCC warning about binary literal (#595 ) 0b10101010 -> 0xAA /* 0b10101010 */	1 year ago
anzz1	a5c42c4b13	Fix typo in llama.h (#593 )	1 year ago
anzz1	5a5f8b1501	Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375 ) * Enable Fused-Multiply-Add (FMA) instructions on MSVC __FMA__ macro does not exist in MSVC * Enable F16C/CVT16 vector extensions on MSVC __F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512 * MSVC cvt intrinsics * Add __SSE3__ macro for MSVC too because why not even though it's not currently used for anything when AVX is defined	1 year ago
anzz1	f1217055ea	CI: fix subdirectory path globbing (#546 ) - Changes in subdirectories will now be detecter properly - (Windows-MSVC) AVX512 tests temporarily disabled	1 year ago
anzz1	7f4c5c6651	llama : fix linkage with mingw (#551 ) * Revert `7e53955` (#542) Still needs to be fixed properly * Fix linking on mingw32	1 year ago
slaren	2a98bc18ea	ggml : add AVX2 implementation of quantize_row_q4_1 (#515 ) * Add AVX2 implementation of quantize_row_q4_1 * Actually use AVX2 * Make quantize_row_q4_1 static Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
thement	d0aaff571c	py : add temporary script to convert old ggml files to newer version (#539 ) Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>	1 year ago
Tai Duc Nguyen	d0330fd783	py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403 )	1 year ago
Stephan Walter	99c5b27654	ggml : refactor quantized processing functions (#509 ) * Refactor quantized processing functions * ggml : minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
DooWoong Lee (David)	692ce3164e	py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547 )	1 year ago
Georgi Gerganov	96f9c0506f	ci : make ctest verbose, hopefully we see what is wrong with the sanitizer	1 year ago
Georgi Gerganov	d502bc7c9d	tests : free llama context at the end of the test	1 year ago
Stephan Walter	436e561931	all : be more strict about converting float to double (#458 ) * Be more strict about converting float to double * Test equivalence of round, SILU implementations Test module is commented out in CMakeLists.txt because the tests may take a long time, depending on how much the compiler optimizes. * Fix softmax in perplexity.cpp * all : prefer float over double where appropriate * perplexity : add <cmath> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Jed Fox	20e1e84884	deploy : add a Package.swift for SwiftPM support (#393 ) * Add a Package.swift for SwiftPM support * Swap from exclusions to allowlist	1 year ago
Stephan Walter	c1f885067c	ggml : introduce structs for the q4 data blocks (#356 ) * Introduce structs for the q4 data blocks * ggml : rename quant struct variables + fix ARM_NEON --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Georgi Gerganov	e0670260fb	gitignore : add "embedding"	1 year ago
dotpy314	28ba975aea	Check the existence of f16_model_path_base in quantize.py (#574 ) Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>	1 year ago
slaren	a6bdc47cba	Fix usage of F16C intrinsics in AVX code (#563 ) * Fix usage of F16C intrinsics in AVX code when F16C is not defined	1 year ago
anzz1	7b8dbcb78b	main.cpp fixes, refactoring (#571 ) - main: entering empty line passes back control without new input in interactive/instruct modes - instruct mode: keep prompt fix - instruct mode: duplicate instruct prompt fix - refactor: move common console code from main->common	1 year ago
RJ Adriaansen	4b8efff0e3	Add embedding example to Makefile (#540 )	1 year ago
Marco Matthies	7e5395575a	Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542 )	1 year ago
Erik Scholz	34c1072e49	ci: add debug build to sanitizer build matrix (#527 )	1 year ago
Stephan Walter	939ad2d3a5	Fix undefined variables in debug build, remove unused variables (#531 )	1 year ago
Juan Calderon-Perez	8c2ec5e21d	Add support for linux/arm64 platform during Docker Builds (#514 ) * Add support for linux/arm64 platform * Add platform to versioned builds	1 year ago
Stephan Walter	b391579db9	Update README and comments for standalone perplexity tool (#525 )	1 year ago
anzz1	7a87d31f4f	[main] fix infinite generation (-n == -1) (#523 )	1 year ago
Georgi Gerganov	348d6926ee	Add logo to README.md	1 year ago
Harald Fernengel	33e35b8fe8	Exit from interactive mode if input stream is bad (#491 ) Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z on Windows.	1 year ago
anzz1	19726169b3	CI: Run other sanitizer builds even if one fails (#511 ) applies only to sanitizer builds so they wont be cancelled	1 year ago
jp-x-g	f732695cd5	Clarify console output in convert-pth-to-ggml.py (#512 ) "Processing part 1 of 3" instead of "Processing part 0"	1 year ago
anzz1	2f7bf7dd7c	CMake / CI additions (#497 ) * CMake: Add AVX512 option * CI: Add AVX/AVX512 builds (Windows) (AVX512 tests can only be run when the worker happens to support it, building works anyway) * CMake: Fix sanitizer linkage ( merged #468 ) * CI: Add sanitizer builds (Ubuntu) * CI: Fix release tagging (change @zendesk/action-create-release to @anzz1/action-create-release until upstream PR Added commitish as input zendesk/action-create-release#32 is merged)	1 year ago
anzz1	34ab526843	(Windows) Set console to UTF-8 on init (#420 ) Sets console codepage to 65001 (CP_UTF8) on start for both input and output, should fix problems with UTF-8 characters.	1 year ago
Georgi Gerganov	c2b25b6912	Fix colors enabling on WIN32	1 year ago
Georgi Gerganov	79b2b266db	If n_predict == -1, generate forever	1 year ago
Georgi Gerganov	e2d490dafd	Inifinite generation via context swapping (#71 )	1 year ago
Georgi Gerganov	03f7e33560	Cleanup STL headers + fix embedding examples + minor stuff	1 year ago
Georgi Gerganov	55ad42af84	Move chat scripts into "./examples"	1 year ago
slaren	459e93cce0	Add AVX2 implementation of dequantize_row_q4_1 (#505 )	1 year ago
Georgi Gerganov	a316a425d0	Overhaul the examples structure - main -> examples - utils -> examples (renamed to "common") - quantize -> examples - separate tools for "perplexity" and "embedding" Hope I didn't break something !	1 year ago
Georgi Gerganov	ecbe466a36	Retire the ggml_mul_mat() branch for transposed src0 (#500 ) * Retire the ggml_mul_mat() for transposed src0 - It can always be made contiguous with ggml_cpy() - The code is now simplified - The results are deterministic in respect to num threads * SIMD-ify dequantize_row_q4_0() for ARM_NEON (#502) * Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON * Fix dequantization - forgot to interleave the quants	1 year ago
Georgi Gerganov	502a400192	Disable prompt verbosity by default and add option to enable (#480 )	1 year ago
slaren	09aecbf628	Add AVX2 implementation of dequantize_row_q4_0 (#467 )	1 year ago
Georgi Gerganov	4640eff23d	Don't interefe with BLAS for large prompts by running only 1 thread	1 year ago
Georgi Gerganov	ab77d76312	Add longer DAN prompt for testing big batch numbers	1 year ago
slaren	29b7baab67	Add timings for the prompt evaluation (#478 )	1 year ago
Georgi Gerganov	4a7129acd2	Remove obsolete information from README	1 year ago


293 Commits (62b3e81aaeafb282934de8b21de13b0104f12f8c)
