
Releases: xhedit/llama-cpp-conv

recook-metal

30 Apr 17:29
3781eb3
Update publish.yaml

recook-cu124

30 Apr 17:26
3781eb3
Update publish.yaml

recook-cu123

30 Apr 17:26
3781eb3
Update publish.yaml

recook-cu122

30 Apr 17:26
3781eb3
Update publish.yaml

recook-cu121

30 Apr 17:26
3781eb3
Update publish.yaml

recook

30 Apr 19:57
5df7a07

dev-metal

06 Apr 23:57
bf766bd
merge test (#2)

* feat: add support for KV cache quantization options (#1307) (usage sketched after this changelog)

* add KV cache quantization options

https://github.com/abetlen/llama-cpp-python/discussions/1220
https://github.com/abetlen/llama-cpp-python/issues/1305

* Add ggml_type

* Use ggml_type instead of string for quantization

* Add server support

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>

* fix: Changed local API doc references to hosted (#1317)

* chore: Bump version

* fix: last tokens not passed correctly to the sample_repetition_penalties function (#1295)

Co-authored-by: ymikhaylov <ymikhaylov@x5.ru>
Co-authored-by: Andrei <abetlen@gmail.com>

* feat: Update llama.cpp

* fix: segfault when logits_all=False. Closes #1319

* feat: Binary wheels for CPU, CUDA (12.1 - 12.3), Metal (#1247)

* Generate binary wheel index on release

* Add total release downloads badge

* Update download label

* Use official cibuildwheel action

* Add workflows to build CUDA and Metal wheels

* Update generate index workflow

* Update workflow name

* feat: Update llama.cpp

* chore: Bump version

* fix(ci): use correct script name

* docs: LLAMA_CUBLAS -> LLAMA_CUDA

* docs: Add docs explaining how to install pre-built wheels.

* docs: Rename cuBLAS section to CUDA

* fix(docs): incorrect tool_choice example (#1330) (corrected schema sketched after this changelog)

* feat: Update llama.cpp

* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes #1328, #1314

* feat: Update llama.cpp

* fix: Always embed metal library. Closes #1332

* feat: Update llama.cpp

* chore: Bump version

---------

Co-authored-by: Limour <93720049+Limour-dev@users.noreply.github.com>
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
Co-authored-by: lawfordp2017 <lawfordp@gmail.com>
Co-authored-by: Yuri Mikhailov <bitsharp@gmail.com>
Co-authored-by: ymikhaylov <ymikhaylov@x5.ru>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
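
Example: KV cache quantization (#1307). The feature above adds type_k and type_v options for quantizing the KV cache, taking ggml_type values rather than strings. A minimal sketch against the llama-cpp-python API; the model path is hypothetical, and exact parameter support may vary by version:

    from llama_cpp import Llama
    import llama_cpp

    # Quantize both the K and V caches to Q8_0 to reduce memory use.
    # GGML_TYPE_* constants are exposed by the llama_cpp module.
    llm = Llama(
        model_path="models/model.gguf",  # hypothetical path
        n_ctx=4096,
        type_k=llama_cpp.GGML_TYPE_Q8_0,
        type_v=llama_cpp.GGML_TYPE_Q8_0,
    )

Note that quantizing the V cache may additionally require flash-attention support in the underlying llama.cpp build.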

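Example: tool_choice usage (#1330). The docs fix above corrects the tool_choice example; below is a hedged sketch of the OpenAI-style schema that llama-cpp-python accepts, where the function name, model path, and chat_format value are illustrative:

    from llama_cpp import Llama

    llm = Llama(
        model_path="models/model.gguf",  # hypothetical path
        chat_format="chatml-function-calling",
    )
    response = llm.create_chat_completion(
        messages=[{"role": "user", "content": "What is the weather in Oslo?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        # Force the model to call the named function rather than reply in prose.
        tool_choice={"type": "function", "function": {"name": "get_weather"}},
    )
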
dev-cu123

06 Apr 23:41
bf766bd
merge test (#2)

Same changelog as dev-metal above.

dev-cu122

06 Apr 23:40
bf766bd
merge test (#2)

Same changelog as dev-metal above.

dev-cu121

06 Apr 23:40
bf766bd
merge test (#2)

Same changelog as dev-metal above.