llama-cli misbehaving (changed?) #12036
For reference, this is the output I get with the same model using b4000:
```sh
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j$(nproc)
wget https://huggingface.co/ggml-org/gemma-1.1-7b-it-Q4_K_M-GGUF/resolve/main/gemma-1.1-7b-it.Q4_K_M.gguf -O gemma-7b-q4.gguf
./build/bin/llama-cli -m gemma-7b-q4.gguf -c 1024 -p "Once upon a time"
```

Output:
Is the documentation outdated? https://github.com/ggml-org/llama.cpp/blob/master/examples/main/README.md
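As a side note, for anyone wanting to reproduce the b4000 comparison locally: one way, assuming the repository's usual bNNNN release tags (e.g. b4000, b4762), is to check out the tag before building. A minimal sketch:

```sh
# Sketch: build a specific tagged release for comparison.
# Assumes llama.cpp tags its releases as bNNNN (e.g. b4000, b4762).
git clone https://github.com/ggerganov/llama.cpp llama.cpp-b4000
cd llama.cpp-b4000
git checkout b4000
cmake -B build-b4000
cmake --build build-b4000 --config Release -j$(nproc)
./build-b4000/bin/llama-cli -m ../gemma-7b-q4.gguf -c 1024 -p "Once upon a time"
```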
I have a Colab notebook here to quantize and test the models:
https://colab.research.google.com/drive/1TcyGL60GQzsxEHu-Xlos5u8bb_6SxMa3
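The notebook itself isn't reproduced here, but a rough local equivalent of its quantize step, assuming a standard llama.cpp checkout and a Hugging Face model directory (the paths and output names below are placeholders), would be:

```sh
# Rough sketch of a quantize pass with the stock llama.cpp tools;
# /path/to/hf-model and the file names are placeholders, not the
# notebook's exact values.
python convert_hf_to_gguf.py /path/to/hf-model --outfile model-f16.gguf
./build/bin/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```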
The simple test has always been this line:
Usually, after initialization the models start answering (and then even continue on their own, which is fine).
Now (b4762) it instead does this:
Am I doing something wrong?
Note: if I use b4000, everything works as usual.
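For triage, a fixed seed and zero temperature make runs from the two builds directly comparable; a minimal sketch, assuming each tag was built into its own build-bNNNN/ directory as in the checkout sketch above:

```sh
# Minimal A/B check across the two builds (directory layout assumed);
# a fixed --seed and --temp 0 keep generation deterministic so the
# two outputs can be diffed directly.
PROMPT="Once upon a time"
for tag in b4000 b4762; do
  echo "=== $tag ==="
  ./build-$tag/bin/llama-cli -m gemma-7b-q4.gguf -c 1024 -n 64 \
    --seed 42 --temp 0 -p "$PROMPT"
done
```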