Req: Deepseek-coder-33b-instruct Prompt Template #1082

Closed
MSZ-MGS opened this issue Nov 10, 2023 · 9 comments · Fixed by #1083
Comments

@MSZ-MGS
Contributor

MSZ-MGS commented Nov 10, 2023

This seems to be a very promising LLM; please define its prompt template.

(Screenshot attached: Screenshot_20231111_023807_Chrome)

Website for more info:
https://deepseekcoder.github.io/

Link to huggingface:
https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct
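
For reference, the instruct prompt format described on the model card looks roughly like the sketch below (the system text is paraphrased and the helper name is made up; check the model card for the exact wording and stop token):

# Rough sketch of the DeepSeek-Coder instruct layout; the system text is
# paraphrased and build_deepseek_coder_prompt is a hypothetical helper name.
def build_deepseek_coder_prompt(instruction: str) -> str:
    system = "You are an AI programming assistant."  # paraphrased, not the exact card text
    return f"{system}\n### Instruction:\n{instruction}\n### Response:\n"

print(build_deepseek_coder_prompt("Write a Python function that reverses a string."))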

@pseudotensor
Collaborator

Sure, done. Thanks for the suggestion.

@MSZ-MGS
Contributor Author

MSZ-MGS commented Nov 11, 2023

Sure, done. Thanks for the suggestion.

Thank you @pseudotensor, but I receive the error below (attached text file).
Note: it worked with llama-cpp-python 0.2.14 using the same installation method from the h2oGPT docs.

What do you think? Does it make sense?
CoderError.txt (attached)

@pseudotensor
Collaborator

pseudotensor commented Nov 11, 2023

The HF model works for me.

For GGUF you have:

ERROR: byte not found in vocab: '
'
Windows fatal exception: access violation

This sounds like a bug in llama.cpp or llama_cpp_python in handling the file.

For me, when I run:

python generate.py --base_model=TheBloke/deepseek-coder-33B-instruct-GGUF --max_seq_len=4096 --max_new_tokens=2048 --prompt_type=deepseek_coder

and I get the same thing:

ERROR: byte not found in vocab: '
'
Fatal Python error: Segmentation fault

Current thread 0x00007f83baa29740 (most recent call first):
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/llama_cpp_cuda/llama_cpp.py", line 498 in llama_load_model_from_file
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/llama_cpp_cuda/llama.py", line 357 in __init__
  File "/home/jon/h2ogpt/src/gpt4all_llm.py", line 363 in validate_environment
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/langchain/load/serializable.py", line 97 in __init__
  File "/home/jon/h2ogpt/src/gpt4all_llm.py", line 175 in get_llm_gpt4all
  File "/home/jon/h2ogpt/src/gpt4all_llm.py", line 27 in get_model_tokenizer_gpt4all
  File "/home/jon/h2ogpt/src/gen.py", line 2043 in get_model
  File "/home/jon/h2ogpt/src/gen.py", line 1809 in get_model_retry
  File "/home/jon/h2ogpt/src/gen.py", line 1487 in main
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 691 in _CallAndUpdateTrace
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 475 in _Fire
  File "/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 141 in Fire
  File "/home/jon/h2ogpt/src/utils.py", line 65 in H2O_Fire
  File "/home/jon/h2ogpt/generate.py", line 12 in entrypoint_main
  File "/home/jon/h2ogpt/generate.py", line 16 in <module>

So they probably fixed something.

But this issue is still open: abetlen/llama-cpp-python#840
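
To isolate whether the crash is in llama-cpp-python itself rather than in h2oGPT, a minimal load outside h2oGPT could look like the following sketch (the local GGUF path is hypothetical and assumes the file was downloaded from TheBloke's repo):

# Minimal sketch: load the GGUF directly with llama-cpp-python, outside h2oGPT.
# The model path is a hypothetical local file name.
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-coder-33b-instruct.Q4_K_M.gguf",  # hypothetical local path
    n_ctx=4096,
)
out = llm("### Instruction:\nWrite hello world in Python.\n### Response:\n",
          max_tokens=64)
print(out["choices"][0]["text"])

If this segfaults in the same way, the bug is below h2oGPT, in the llama.cpp / llama-cpp-python stack.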

@pseudotensor
Collaborator

I tried the latest wheel from jllllll for the same version, and also the latest 0.2.17, and it still fails in the same way. Which exact link did you use?

@MSZ-MGS
Contributor Author

MSZ-MGS commented Nov 11, 2023

I tried the latest wheel from jllllll for the same version, and also the latest 0.2.17, and it still fails in the same way. Which exact link did you use?

I compiled it using your commands, only changing the llama-cpp-python version:

pip uninstall -y llama-cpp-python
set LLAMA_CUBLAS=1
set CMAKE_ARGS=-DLLAMA_CUBLAS=on
set FORCE_CMAKE=1
pip install llama-cpp-python==0.2.14 --no-cache-dir --verbose
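
A quick way to confirm which version actually got picked up after the reinstall (a sketch):

# Check the installed llama-cpp-python version from Python.
import llama_cpp
print(llama_cpp.__version__)  # expected: 0.2.14 after the reinstall above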

@pseudotensor
Collaborator

pseudotensor commented Nov 11, 2023

Yes with https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.14+cu118-cp310-cp310-manylinux_2_31_x86_64.whl

For whatever reason, that one doesn't fail in the same way. Instead I get an OOM:

CUDA error 2 at /home/runner/work/llama-cpp-python-cuBLAS-wheels/llama-cpp-python-cuBLAS-wheels/vendor/llama.cpp/ggml-cuda.cu:7624: out of memory
current device: 0

With 0.2.14 from jllllll, running:

CUDA_VISIBLE_DEVICES=0 python generate.py --base_model=TheBloke/deepseek-coder-6.7B-instruct-GGUF --max_seq_len=4096 --max_new_tokens=2048 --prompt_type=deepseek_coder

gives:

(screenshot of the working output attached)

So the OOM is expected on my 24GB board for the 33B model, I guess, but the vocab error is odd, and 0.2.14 from jllllll fixes it, or maybe recompiling it yourself fixes it.

I actually don't expect it's jllllll's fault, since 0.2.14 worked for me. I'm guessing the llama_cpp_python or llama.cpp teams are not stable in their code changes. I suspect jllllll is using the same build commands all the time.
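
As an aside, for the OOM on a single 24GB card with the 33B GGUF, the underlying llama-cpp-python knob for partial offload is n_gpu_layers; a sketch follows (layer count and path are illustrative, not tuned, and this is independent of h2oGPT's own flags):

# Sketch: offload only part of the 33B model to the GPU, keep the rest on CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-coder-33b-instruct.Q4_K_M.gguf",  # hypothetical local path
    n_ctx=4096,
    n_gpu_layers=40,  # illustrative; fewer layers than the full model
)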

@MSZ-MGS
Contributor Author

MSZ-MGS commented Nov 14, 2023

@pseudotensor llama-cpp-python 0.2.18 is working fine. Anything between that and 0.2.14 is not working.
Not sure if going to 0.2.18 will add any value; this is just for your information.

@pseudotensor
Collaborator

Thanks. 0.2.14 was messed up too; the responses were all wrong for GGUF models. 0.2.18 is back to normal, thanks.

@pseudotensor
Collaborator

0.2.18 is bad unless one builds it directly.

abetlen/llama-cpp-python#912
