llama : add `llama_vocab`, functions -> methods, naming #11110
Conversation
I'm stumped by this error in the CI: … Not sure what is causing it. @slaren Do you have any guess what could be the issue?

Looks like a compiler bug, but I cannot reproduce it locally.

I think it's safe to assume that this is a compiler bug. It only happens with clang on Windows, and only on release builds. Changing the generator from "ninja multi-config" to "ninja" fixes it for the arm build, but not with hip or sycl. The destructors that it complains about seem to be the maps in …

Thanks!
This should be ready to merge. These are 5 PRs refactoring … This set of PRs, together with #11167 and #11174, all introduce API changes, so it would make sense to merge them together to avoid multiple breaking changes. After this is merged, I will attempt a similar refactoring for …
Looks good, I think it is very good to settle on C++ OOP style rather than the current mix of C and C++. Some suggestions for further refactoring:

- Move the code from `llama_model_load_from_file` to the `llama_model` constructor
- Move `llama_vocab::load` to the constructor
- `struct` with private members is not very common, might be better to use `class`
- I don't see the point of declaring empty destructors, they can be removed or explicitly set to `default`. Very few classes need a destructor. (A sketch of this appears after the list.)
- At some point we should abstract everything needed to model an architecture to a single class (such that each architecture is a subclass of this class)
- After that, `llm_type` should probably be removed entirely, and each architecture should have its own enum if needed, with a function to return the type as a string (which by default could be `"<arch> <params>"`)
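A minimal sketch of the empty-destructor point; `vocab_before` and `vocab_after` are hypothetical names for illustration, not taken from the llama.cpp sources:

```cpp
#include <map>
#include <string>

// Before: an empty user-declared destructor adds noise and also suppresses
// the implicitly generated move constructor/assignment for the class.
struct vocab_before {
    std::map<std::string, int> token_to_id;

    ~vocab_before() {} // empty, serves no purpose
};

// After: omit the destructor entirely, or declare it defaulted when an
// out-of-line definition is needed (e.g. pimpl with an incomplete type
// in the header).
struct vocab_after {
    std::map<std::string, int> token_to_id;

    ~vocab_after() = default;
};
```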
```diff
         LLAMA_LOG_WARN(
             "%s: Added a BOS token to the prompt as specified by the model but the prompt "
             "also starts with a BOS token. So now the final prompt starts with 2 BOS tokens. "
             "Are you sure this is what you want?\n", __FUNCTION__);
     }

-    if (vocab.tokenizer_add_eos && output.size() >= 2 && *(output.end()-2) == vocab.special_eos_id) {
+    if (vocab.get_add_bos() && output.size() >= 2 && *(output.end()-2) == vocab.token_eos()) {
```
`vocab.get_add_bos()` is probably a typo, @ggerganov?
It's ok - it means "get the `add_bos` flag". The "get" is necessary to disambiguate from the action of adding a BOS, `vocab.add_bos()`.
@ggerganov what I mean is that you are checking for EOS, but using `vocab.get_add_bos()`. Should it not be `vocab.get_add_eos()` instead?
Oh yes. Thanks for spotting - fixing.
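For reference, the fixed condition checks the EOS flag instead; this is a sketch based on the diff above (the warning text here is illustrative, not quoted from the sources):

```cpp
// Check the add_eos flag (not add_bos) when warning about a duplicated
// EOS token at the end of the tokenized prompt:
if (vocab.get_add_eos() && output.size() >= 2 && *(output.end()-2) == vocab.token_eos()) {
    LLAMA_LOG_WARN(
        "%s: Added an EOS token to the prompt as specified by the model but the prompt "
        "also ends with an EOS token. So now the final prompt ends with 2 EOS tokens. "
        "Are you sure this is what you want?\n", __FUNCTION__);
}
```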
Squashed commit message of the merged changes:

* llama : functions -> methods (ggml-org#11110)
* llama : add struct llama_vocab to the API (ggml-org#11156) ggml-ci
* hparams : move vocab params to llama_vocab (ggml-org#11159) ggml-ci
* vocab : more pimpl (ggml-org#11165) ggml-ci
* vocab : minor tokenization optimizations (ggml-org#11160) ggml-ci Co-authored-by: Diego Devesa <slarengh@gmail.com>
* lora : update API names (ggml-org#11167) ggml-ci
* llama : update API names to use correct prefix (ggml-org#11174)
  * llama : update API names to use correct prefix ggml-ci
  * cont ggml-ci
  * cont ggml-ci
  * minor [no ci]
* vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (ggml-org#11174) ggml-ci
* vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (ggml-org#11174) ggml-ci

Co-authored-by: Diego Devesa <slarengh@gmail.com>
This PR refactors `struct llama_model` and `struct llama_vocab` related functionality. Moved the tensor data loading to `src/llama-model.cpp`. The `src/llama.cpp` now contains primarily graph build logic. `struct llama_vocab` is now public in the `llama` API and the respective calls use it instead of `struct llama_model`. Improved naming consistency in the public API.

Sub-PRs

- Add `struct llama_vocab` to the API #11156

API changes
- Multiple naming changes in the `llama_context` and `llama_model` API. Old names have been deprecated.
- Add `struct llama_vocab` to the API
- `llama_n_vocab()` now accepts `struct llama_vocab` instead of `struct llama_model` (see the sketch after this list)
- `llama_sampler_init_dry()` now accepts `struct llama_vocab` instead of `struct llama_model`
- The tokenization API now accepts `struct llama_vocab` instead of `struct llama_model`
- The sampler API now accepts `struct llama_vocab` instead of `struct llama_model`
- Update API names for improved consistency:
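To illustrate the `llama_n_vocab()` change, a minimal before/after sketch, assuming `model` is an already-loaded `llama_model *` (note the squashed commit message above also renames `llama_vocab_n_vocab` to `llama_vocab_n_tokens`):

```cpp
// Old API: the vocab size was queried from the model directly.
// const int32_t n_vocab = llama_n_vocab(model);

// New API: fetch the vocab from the model, then query it.
const struct llama_vocab * vocab = llama_model_get_vocab(model);
const int32_t n_vocab = llama_n_vocab(vocab);
```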
Adapter API

Multiple naming changes in this API. Skipped the deprecation phase.

- `struct llama_control_vector` -> `struct llama_adapter_cvec`
- `struct llama_lora_adapter` -> `struct llama_adapter_lora`
- `llama_lora_adapter_[verb](ctx, ...)` -> `llama_[verb]_adapter_lora(ctx, ...)` (see the sketch after this list)
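Applying the `[verb]` rename pattern with `set` as the verb gives the following sketch; `adapter` and `scale` are placeholder variables, and the exact parameter list is an assumption, not taken from this PR:

```cpp
// Old name, following llama_lora_adapter_[verb](ctx, ...):
// llama_lora_adapter_set(ctx, adapter, scale);

// New name, following llama_[verb]_adapter_lora(ctx, ...):
llama_set_adapter_lora(ctx, adapter, scale);
```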
Migration instructions

Adapting user code to the changes is fairly straightforward:

- Use `llama_model_get_vocab(model)` where the old API required `llama_model` and the new API requires `llama_vocab` (a tokenization sketch follows)
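For example, a hedged migration sketch for tokenization; it assumes the `llama_tokenize()` parameters are otherwise unchanged, with only the first argument switching from `llama_model` to `llama_vocab`:

```cpp
#include "llama.h"

#include <string>
#include <vector>

// Tokenize a prompt with the post-refactor API: the vocab is fetched from
// the model once and passed to the tokenizer instead of the model itself.
std::vector<llama_token> tokenize_prompt(const struct llama_model * model, const std::string & prompt) {
    const struct llama_vocab * vocab = llama_model_get_vocab(model);

    // Worst case: one token per byte, plus one slot for an added BOS token.
    std::vector<llama_token> tokens(prompt.size() + 1);

    const int32_t n = llama_tokenize(vocab, prompt.c_str(), (int32_t) prompt.size(),
                                     tokens.data(), (int32_t) tokens.size(),
                                     /*add_special*/ true, /*parse_special*/ false);
    tokens.resize(n < 0 ? 0 : n);
    return tokens;
}
```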