361 Commits (2d3481c72125cd388258864c7ad8d7d36777bad7)
 

Author SHA1 Message Date
Slaren 276e5b7811 Unmap the file in llama_free 1 year ago
Slaren d68c5dc435 Make mmap_file static 1 year ago
Slaren 64bde3ffd4 Fix ggml_init_params in quantize 1 year ago
Slaren c03ae8dca1 Add mmap support for model files 1 year ago
Stephan Walter 3bcc129ba8
cmake : properly invoke CTest (#629) 1 year ago
Casey Primozic a4755cf288
Remove unused variable (#607)
* It seems some new warnings were added recently that exposed this. I wrote the code that included this unused variable originally, and it is indeed not needed.
1 year ago
david raistrick 1f0414feec
make : fix darwin f16c flags check (#615)
...there was no check. Ported upstream from https://github.com/zanussbaum/gpt4all.cpp/pull/2 (I don't see any clean path for upstream patches)
1 year ago
Georgi Gerganov 77efdf5a50
ggml : fix NEON signs (close #620, #622) 1 year ago
slaren ed3c680bcd
Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) 1 year ago
anzz1 9cbc404ba6
ci : re-enable AVX512 testing (Windows-MSVC) (#584)
* CI: Re-enable AVX512 testing (Windows-MSVC)

Now with 100% less base64 encoding

* plain __cpuid is enough here
1 year ago
Georgi Gerganov b51c717d5c
ggml : init time on first ggml_init() call 1 year ago
Georgi Gerganov 0ba76c1e73
llama : fix compile warnings when reading the vocab 1 year ago
Georgi Gerganov cea1c85948
ggml : add ARM_NEON dequantize_row_q4_1() 1 year ago
Georgi Gerganov f202ada131
ggml : add ARM_NEON quantize_row_q4_1() 1 year ago
Georgi Gerganov 3b44d30d9b
ggml : add ARM_NEON ggml_vec_dot_q4_1() 1 year ago
Pavol Rusnak 61cbfff5c9
rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600)
to match filenames of other converters
1 year ago
Thérence d9ad104440
Create chat-13B.bat (#592)
* Create chat-13B.bat

Same script as chat-13B.sh, but for Windows users.
Tested and working on Windows 10/11 version 22H2

* Apply suggestions from code review

---------

Co-authored-by: anzz1 <anzz1@live.com>
1 year ago
Georgi Gerganov b467702b87
readme : fix typos 1 year ago
Georgi Gerganov 516d88e75c
readme : add GPT4All instructions (close #588) 1 year ago
Georgi Gerganov 53635c081c
py : add GPT4All conversion script
For now: copy-paste
Too much time for me to deduplicate the python code
1 year ago
Maël Kerbiriou 41318d708e
llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) 1 year ago
Tobias Lütke a6956b25a1
add example of re-act pattern (#583)
* add example of re-act pattern

* spelling...

* fixed whitespace in reverse prompt issue
1 year ago
anzz1 83df5639eb
Fix GCC warning about binary literal (#595)
0b10101010 -> 0xAA /* 0b10101010 */
1 year ago
anzz1 a5c42c4b13
Fix typo in llama.h (#593) 1 year ago
anzz1 5a5f8b1501
Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)
* Enable Fused-Multiply-Add (FMA) instructions on MSVC

__FMA__ macro does not exist in MSVC

* Enable F16C/CVT16 vector extensions on MSVC

__F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512

* MSVC cvt intrinsics

* Add __SSE3__ macro for MSVC too because why not

even though it's not currently used for anything when AVX is defined
1 year ago
anzz1 f1217055ea
CI: fix subdirectory path globbing (#546)
- Changes in subdirectories will now be detected properly
- (Windows-MSVC) AVX512 tests temporarily disabled
1 year ago
anzz1 7f4c5c6651
llama : fix linkage with mingw (#551)
* Revert 7e53955 (#542)

Still needs to be fixed properly

* Fix linking on mingw32
1 year ago
slaren 2a98bc18ea
ggml : add AVX2 implementation of quantize_row_q4_1 (#515)
* Add AVX2 implementation of quantize_row_q4_1

* Actually use AVX2

* Make quantize_row_q4_1 static

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
thement d0aaff571c
py : add temporary script to convert old ggml files to newer version (#539)
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
1 year ago
Tai Duc Nguyen d0330fd783
py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) 1 year ago
Stephan Walter 99c5b27654
ggml : refactor quantized processing functions (#509)
* Refactor quantized processing functions

* ggml : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
DooWoong Lee (David) 692ce3164e
py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) 1 year ago
Georgi Gerganov 96f9c0506f
ci : make ctest verbose, hopefully we see what is wrong with the sanitizer 1 year ago
Georgi Gerganov d502bc7c9d
tests : free llama context at the end of the test 1 year ago
Stephan Walter 436e561931
all : be more strict about converting float to double (#458)
* Be more strict about converting float to double

* Test equivalence of round, SILU implementations

Test module is commented out in CMakeLists.txt because the tests may
take a long time, depending on how much the compiler optimizes.

* Fix softmax in perplexity.cpp

* all : prefer float over double where appropriate

* perplexity : add <cmath>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Jed Fox 20e1e84884
deploy : add a Package.swift for SwiftPM support (#393)
* Add a Package.swift for SwiftPM support

* Swap from exclusions to allowlist
1 year ago
Stephan Walter c1f885067c
ggml : introduce structs for the q4 data blocks (#356)
* Introduce structs for the q4 data blocks

* ggml : rename quant struct variables + fix ARM_NEON

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Georgi Gerganov e0670260fb
gitignore : add "embedding" 1 year ago
dotpy314 28ba975aea
Check the existence of f16_model_path_base in quantize.py (#574)
Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>
1 year ago
slaren a6bdc47cba
Fix usage of F16C intrinsics in AVX code (#563)
* Fix usage of F16C intrinsics in AVX code when F16C is not defined
1 year ago
anzz1 7b8dbcb78b
main.cpp fixes, refactoring (#571)
- main: entering empty line passes back control without new input in interactive/instruct modes
- instruct mode: keep prompt fix
- instruct mode: duplicate instruct prompt fix
- refactor: move common console code from main->common
1 year ago
RJ Adriaansen 4b8efff0e3
Add embedding example to Makefile (#540) 1 year ago
Marco Matthies 7e5395575a
Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) 1 year ago
Erik Scholz 34c1072e49
ci: add debug build to sanitizer build matrix (#527) 1 year ago
Stephan Walter 939ad2d3a5
Fix undefined variables in debug build, remove unused variables (#531) 1 year ago
Juan Calderon-Perez 8c2ec5e21d
Add support for linux/arm64 platform during Docker Builds (#514)
* Add support for linux/arm64 platform

* Add platform to versioned builds
1 year ago
Stephan Walter b391579db9
Update README and comments for standalone perplexity tool (#525) 1 year ago
anzz1 7a87d31f4f
[main] fix infinite generation (-n == -1) (#523) 1 year ago
Georgi Gerganov 348d6926ee
Add logo to README.md 1 year ago
Harald Fernengel 33e35b8fe8
Exit from interactive mode if input stream is bad (#491)
Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z
on Windows.
1 year ago