Commit Graph

4 Commits (a50e39c6fe36be3de0941b3c05aaf9c37912fd47)

Author SHA1 Message Date
Stephan Walter 69c92298a9
Deduplicate q4 quantization functions (#383)
* Deduplicate q4 quantization functions

* Use const; add basic test

* Re-enable quantization test

* Disable AVX2 flags in CI

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Georgi Gerganov f5a77a629b
Introduce C-style API (#370)
* Major refactoring - introduce C-style API

* Clean up

* Add <cassert>

* Add <iterator>

* Add <algorithm> ....

* Fix timing reporting and accumulation

* Measure eval time only for single-token calls

* Change llama_tokenize return meaning
1 year ago
hoangmit 6eac39ba95
Add RMS norm and use it (#187)
* add ggml_rms_norm

* update op num
1 year ago
Georgi Gerganov 26c0846629
Initial release
1 year ago