
tensor 'tok_embeddings.weight' has wrong size in model file #14

Closed
shaunabanana opened this issue Mar 11, 2023 · 1 comment
Labels
build Compilation issues

Comments

@shaunabanana
When trying to run the 13B model, the following output is given:

main: seed = 1678543550
llama_model_load: loading model from './models/13B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 5120
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 13824
llama_model_load: ggml ctx size = 8559.49 MB
llama_model_load: memory_size =   800.00 MB, n_mem = 20480
llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
main: failed to load model from './models/13B/ggml-model-q4_0.bin'

I have followed the commands in the readme to quantize the model, i.e.:

python3 convert-pth-to-ggml.py models/13B/ 1
./quantize ./models/13B/ggml-model-f16.bin   ./models/13B/ggml-model-q4_0.bin 2
./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2

I am using an M1 MacBook Pro. Any thoughts on how to resolve this issue?
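For context, the loader's complaint is a mismatch between the tensor size stored in the file and the size it expects from the model hyperparameters. A rough sketch of that kind of check, using the `n_vocab` and `n_embd` values printed in the log above (this mirrors the general idea, not the actual llama.cpp implementation):

```python
# Hypothetical size check, assuming the token-embedding tensor should
# hold n_vocab * n_embd elements (values taken from the log above).
n_vocab = 32000
n_embd = 5120

expected_elements = n_vocab * n_embd  # 163840000

def check_tensor(name: str, elements_in_file: int) -> None:
    """Raise if the tensor recorded in the model file has the wrong size."""
    if elements_in_file != expected_elements:
        raise ValueError(f"tensor '{name}' has wrong size in model file")

check_tensor("tok_embeddings.weight", expected_elements)  # passes
```

A stale `main`/`quantize` binary built against an older file format would compute a different expected size, which is consistent with the resolution below.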

@shaunabanana
Author

I just realized that I had been using binaries (`quantize` and `main`) compiled from a previous version. Recompiling solved the issue. Thank you for your awesome work!

@gjmulder gjmulder added the build Compilation issues label Mar 15, 2023
abetlen pushed a commit to abetlen/llama.cpp that referenced this issue Apr 10, 2023: "Update macOS, better instructions, streaming output"
SlyEcho pushed a commit to SlyEcho/llama.cpp that referenced this issue Jun 2, 2023: "Trim partial stopping strings when not streaming and move multibyte check."
chsasank pushed a commit to chsasank/llama.cpp that referenced this issue Dec 20, 2023: "Update LICENSE with our copyright notice; Update README.md; fix readme anchor"