In general, it seems to work, but I think it fails for unicode character support. Hopefully, someone can help with that
- I don't know yet how much the quantization affects the quality of the generated text
- I don't know yet how much the quantization affects the quality of the generated text
- Probably the token sampling can be improved
- Probably the token sampling can be improved
- The Accelerate framework is actually currently unused since I found that for tensor shapes typical for the Decoder,
- The Accelerate framework is actually currently unused since I found that for tensor shapes typical for the Decoder,
@ -192,16 +183,15 @@ Note the use of `--color` to distinguish between user input and generated text.
### Contributing
### Contributing
- There are 2 git branches: [master](https://github.com/ggerganov/llama.cpp/commits/master) and [dev](https://github.com/ggerganov/llama.cpp/commits/dev)
- Contributors can open PRs
- Contributors can open PRs to either one
- Collaborators can push to branches in the `llama.cpp` repo
- Collaborators can push straight into `dev`, but need to open a PR to get stuff to `master`
- Collaborators will be invited based on contributions
- Collaborators will be invited based on contributions
- `dev` branch is considered unstable
- `master` branch is considered stable and approved. 3-rd party projects should use the `master` branch
General principles to follow when writing code:
### Coding guide-lines
- Avoid adding third-party dependencies, extra files, extra headers, etc.
- Avoid adding third-party dependencies, extra files, extra headers, etc.
- Always consider cross-compatibility with other operating systems and architectures
- Always consider cross-compatibility with other operating systems and architectures
- Avoid fancy looking modern STL constructs, use basic for loops, avoid templates, keep it simple
- Avoid fancy looking modern STL constructs, use basic for loops, avoid templates, keep it simple
- There are no strict rules for the code style, but try to follow the patterns in the code (indentation, spaces, etc.). Vertical alignment makes things more readable and easier to batch edit
- There are no strict rules for the code style, but try to follow the patterns in the code (indentation, spaces, etc.). Vertical alignment makes things more readable and easier to batch edit
- Clean-up any tailing whitespaces, use 4 spaces indentation, brackets on same line, `int * var`
- Look at the [good first issues](https://github.com/ggerganov/llama.cpp/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) for tasks