3 Commits (04c6f5ed6fafd63601fa06757877ed5ccf9d5991)
| Author | SHA1 | Message | Date |
|---|---|---|---|
| comex | 563cdc391d | Support calling mlock() on loaded model data on Linux and macOS (#453) | 2 years ago |
| Luciano | 8d4a855c24 | Add embedding mode with arg flag. Currently working (#282) | 2 years ago |
| Georgi Gerganov | f5a77a629b | Introduce C-style API (#370) | 2 years ago |

**563cdc391d — Support calling mlock() on loaded model data on Linux and macOS (#453)**

* Support calling mlock() on loaded model data on Linux and macOS. This is enabled by a new `--mlock` command line option. Using mlock() disables swapping and memory compression for the model data. Doing so can be useful on systems where the model takes up a large fraction of system RAM. In my experience, macOS is quite eager to start compressing llama.cpp's memory, which then makes it halt for a few seconds while it decompresses, even with a model that uses "only" 25 GB out of 32 GB. Of course, this comes at the cost of forcing the system to swap or compress other processes' memory instead, so it needs to be used with care and shouldn't be enabled by default. In theory it should be possible to support this on Windows as well using VirtualLock(), but I'm not much of a Windows user.
* Update llama.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

**8d4a855c24 — Add embedding mode with arg flag. Currently working (#282)**

* working but ugly
* add arg flag, not working on embedding mode
* typo
* Working! Thanks to @nullhook
* make params argument instead of hardcoded boolean; remove useless time check
* start doing the instructions but not finished. This probably doesn't compile
* Embeddings extraction support

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

**f5a77a629b — Introduce C-style API (#370)**

* Major refactoring - introduce C-style API
* Clean up
* Add `<cassert>`
* Add `<iterator>`
* Add `<algorithm>` ....
* Fix timing reporting and accumulation
* Measure eval time only for single-token calls
* Change llama_tokenize return meaning

2 years ago