GPT2 Architecture Integration #4073
I started, but could not get it to work. The model outputs something, but it is just gibberish. I lack documentation on llama.cpp and the C++ skills to really finish this, but maybe someone has an idea of how to get it over the line. This is how far I got in my own fork. I based my implementation mainly on the Starcoder class, because the architecture is quite similar. I took inspiration from mmnga's fork, which implemented it in an older version. From my understanding, you need to modify the following elements in the code (a sketch of the gguf-py side follows the list):

- Serializing the model using convert-hf-to-gguf.py
- Adding the mappings in gguf-py/gguf/constants.py and gguf-py/gguf/tensor_mapping.py
- Adjusting the backend file llama.cpp
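For the gguf-py step, here is a minimal sketch of what the additions to gguf-py/gguf/constants.py and gguf-py/gguf/tensor_mapping.py might look like, modeled on the existing Starcoder entries. The enum member, the tensor list, and the HF tensor names are assumptions to verify against the actual files, not tested code:

```python
# Hypothetical additions to gguf-py/gguf/constants.py
# (IntEnum/auto, MODEL_TENSOR, and the two dicts already exist in that file).
class MODEL_ARCH(IntEnum):
    # ... existing architectures ...
    GPT2 = auto()

MODEL_ARCH_NAMES[MODEL_ARCH.GPT2] = "gpt2"

MODEL_TENSORS[MODEL_ARCH.GPT2] = [
    MODEL_TENSOR.TOKEN_EMBD,   # wte
    MODEL_TENSOR.POS_EMBD,     # wpe
    MODEL_TENSOR.OUTPUT_NORM,  # ln_f
    MODEL_TENSOR.OUTPUT,       # materialized lm_head (tied to wte)
    MODEL_TENSOR.ATTN_NORM,    # h.*.ln_1
    MODEL_TENSOR.ATTN_QKV,     # h.*.attn.c_attn (fused Q/K/V)
    MODEL_TENSOR.ATTN_OUT,     # h.*.attn.c_proj
    MODEL_TENSOR.FFN_NORM,     # h.*.ln_2
    MODEL_TENSOR.FFN_UP,       # h.*.mlp.c_fc
    MODEL_TENSOR.FFN_DOWN,     # h.*.mlp.c_proj
]

# Hypothetical additions to gguf-py/gguf/tensor_mapping.py: per-block HF
# checkpoint names for each GGUF tensor kind ({bid} is the block index).
# The "h.{bid}..." names follow the original GPT2 checkpoint layout;
# GPT2_BLOCK_MAPPINGS is an illustrative name, not the file's actual dict.
GPT2_BLOCK_MAPPINGS = {
    MODEL_TENSOR.ATTN_NORM: ("h.{bid}.ln_1",),
    MODEL_TENSOR.ATTN_QKV:  ("h.{bid}.attn.c_attn",),
    MODEL_TENSOR.ATTN_OUT:  ("h.{bid}.attn.c_proj",),
    MODEL_TENSOR.FFN_NORM:  ("h.{bid}.ln_2",),
    MODEL_TENSOR.FFN_UP:    ("h.{bid}.mlp.c_fc",),
    MODEL_TENSOR.FFN_DOWN:  ("h.{bid}.mlp.c_proj",),
}
```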
|
Which model are you using? I tried https://huggingface.co/gpt2 and https://huggingface.co/gpt2-medium/tree/main, but they fail to convert. Once I added the missing properties, they still miss the output layer.
|
As a sidenote,

```python
def set_vocab(self):
    self._set_vocab_sentencepiece()
```

should most likely be

```python
def set_vocab(self):
    self._set_vocab_gpt2()
```

since GPT2 uses a byte-level BPE tokenizer rather than SentencePiece.
|
@Galunid Thanks for having a look at this 👍 I first started with one of the models from AI Sweden, which is based on GPT2. But I realised they have a few specifics, so I made a new commit with a few changes to make it compatible with the original GPT2.
The other thing, as you mentioned, is the lack of an output layer. I extracted it from the model and wrote it to the safetensors file (code below), but I wasn't sure how best to fit it into the codebase. Overall it runs through, but the output is still somewhat gibberish.
Code to add the output layer to safetensors.
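The original snippet was collapsed in the issue; what follows is a minimal sketch of the idea as described above, not the author's actual code. It assumes the Hugging Face transformers and safetensors APIs; the model and file names are illustrative:

```python
# GPT2 ties its output projection (lm_head) to the token embedding (wte),
# so the checkpoint has no separate output tensor. This materializes the
# tied weights and re-saves everything as a safetensors file.
from transformers import GPT2LMHeadModel
from safetensors.torch import save_file

model = GPT2LMHeadModel.from_pretrained("gpt2")
state = model.state_dict()  # includes "lm_head.weight", tied to wte

# clone() breaks the wte/lm_head weight sharing; safetensors refuses to
# serialize tensors that share storage.
tensors = {name: t.clone().contiguous() for name, t in state.items()}

save_file(tensors, "model.safetensors")
```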
|
Would be great to add the GPT2 arch to llama.cpp. |
I'd like to help with this |
Feature Description
The idea is to be able to convert models using the GPT2 architecture into GGUF. convert-hf-to-gguf.py should support GPT2, and llama.cpp should be able to run the resulting model.
Motivation
There are quite a few models for low-resource languages or specific use cases that are fine-tuned on the GPT2 architecture.
Possible Implementation
The structure of the models is quite similar to Starcoder. From my understanding, you can add support quite easily by modifying (see the sketch after this list):

- convert-hf-to-gguf.py
- llama.cpp
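To make the first point concrete, here is a rough sketch of what a GPT2 class in convert-hf-to-gguf.py could look like, modeled on the existing StarCoder class. The hparam keys follow GPT2's config.json; the writer methods and base class are assumptions to check against the script, not a tested implementation:

```python
# Hypothetical GPT2 entry in convert-hf-to-gguf.py, mirroring StarCoder.
class GPT2Model(Model):
    def set_gguf_parameters(self):
        self.gguf_writer.add_name("GPT2")
        self.gguf_writer.add_context_length(self.hparams["n_ctx"])
        self.gguf_writer.add_embedding_length(self.hparams["n_embd"])
        # GPT2's MLP hidden size is fixed at 4 * n_embd.
        self.gguf_writer.add_feed_forward_length(4 * self.hparams["n_embd"])
        self.gguf_writer.add_block_count(self.hparams["n_layer"])
        self.gguf_writer.add_head_count(self.hparams["n_head"])
        self.gguf_writer.add_layer_norm_eps(self.hparams["layer_norm_epsilon"])

    def set_vocab(self):
        # GPT2 ships a byte-level BPE tokenizer, not a SentencePiece model.
        self._set_vocab_gpt2()
```

One pitfall worth checking when the output is gibberish: Hugging Face's GPT2 implementation stores its Conv1D weights transposed relative to nn.Linear, so the attention and MLP weight tensors may need to be transposed during conversion.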
Status
I tried implementing this myself, but I am not deep enough into the topic and find it quite hard to understand the library's structure (is there any good documentation?). So I am probably not able to pull this off by myself, but I am happy to support!