CUDA: Generate error message for unsupported quantizations like iq4_nl

(Originally titled "CUDA crash in llama_decode_internal, when using -ngl with Phi-3".)
The Problem

llama.cpp crashes instead of reporting that it does not support the iq4_nl quantization when layers are offloaded to the GPU:

./llama-cli -ngl 1 -m ~/.local/share/models/Phi-3-mini-4k-instruct-IQ4_NL.gguf

The model is available from https://huggingface.co/lmstudio-community/Phi-3-mini-4k-instruct-GGUF/tree/main
Expected results
"Example: IQ4_NL does not support -ngl. Please run without -ngl flag"
Current Behavior

Instead of printing an error, it aborts on an assertion near the end of ggml-cuda/dmmv.cu (line 665).
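For reference, here is a minimal standalone sketch of the kind of guard this issue asks for: a dispatch on the quantization type that prints a readable error for unsupported types instead of tripping a bare assertion. The enum values, the dispatch function, and the set of supported cases below are illustrative stand-ins, not the actual dmmv.cu code; only `ggml_type_name` mirrors a real ggml helper.

```cpp
// Illustrative sketch only: stand-in types and dispatch, not the real dmmv.cu.
#include <cstdio>
#include <cstdlib>

enum ggml_type { GGML_TYPE_Q4_0, GGML_TYPE_Q8_0, GGML_TYPE_IQ4_NL };

// Mirrors the real ggml_type_name() helper, reduced to the stand-in enum.
static const char * ggml_type_name(ggml_type t) {
    switch (t) {
        case GGML_TYPE_Q4_0:   return "q4_0";
        case GGML_TYPE_Q8_0:   return "q8_0";
        case GGML_TYPE_IQ4_NL: return "iq4_nl";
    }
    return "?";
}

// Only some quantization types have dmmv CUDA kernels; everything else should
// fail with a descriptive message rather than a bare GGML_ASSERT(false).
static void dequantize_mul_mat_vec_dispatch(ggml_type type) {
    switch (type) {
        case GGML_TYPE_Q4_0:
        case GGML_TYPE_Q8_0:
            // ... launch the matching CUDA kernel here ...
            break;
        default:
            fprintf(stderr,
                    "error: %s is not supported by the CUDA dmmv kernels; "
                    "please run without the -ngl flag\n",
                    ggml_type_name(type));
            exit(1);
    }
}

int main() {
    // Reproduces the reported path: an unsupported type reaches the dispatch.
    dequantize_mul_mat_vec_dispatch(GGML_TYPE_IQ4_NL);
}
```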
$ lscpu
(output attached: cpu.txt)

$ lspci
...
$ uname -a
Linux fedora 6.8.9-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 2 18:44:19 UTC 2024 x86_64 GNU/Linux