[rust] Add GTE and Gemma2 model #3422
@@ -119,7 +119,7 @@ def save_rust_model(self, model_info, args: Namespace, temp_dir: str,
 if hasattr(config, "model_type"):
     if config.model_type not in [
         "bert", "camembert", "distilbert", "xlm-roberta",
-        "roberta", "nomic_bert", "mistral", "qwen2"
+        "roberta", "nomic_bert", "mistral", "qwen2", "new"
I don't think "new" is a valid model type. Is there a better way to detect the GTE model?
https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5/blob/main/config.json
The GTE model can be detected by either the "new" model type or the "NewModel" architecture.
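For illustration, here is a minimal sketch of a check that accepts either signal. The helper name `is_gte_config` and the use of a plain dict loaded from `config.json` are assumptions for this example, not code from the PR.

```python
# Hypothetical helper, not part of this PR: the function name and the
# dict-based input are assumptions made for illustration only.
def is_gte_config(config: dict) -> bool:
    """Return True if a parsed config.json looks like a GTE model.

    Checks the two signals mentioned above: model_type == "new"
    or "NewModel" listed under "architectures".
    """
    model_type = config.get("model_type")
    # "architectures" may be missing or null in some configs.
    architectures = config.get("architectures") or []
    return model_type == "new" or "NewModel" in architectures


if __name__ == "__main__":
    print(is_gte_config({"model_type": "new", "architectures": ["NewModel"]}))
    print(is_gte_config({"model_type": "bert", "architectures": ["BertModel"]}))
```

Checking the architecture as well as the model type might make the detection less dependent on the generic "new" string, though both values come from the same model card and could change together.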
@@ -119,7 +119,8 @@ def save_rust_model(self, model_info, args: Namespace, temp_dir: str,
 if hasattr(config, "model_type"):
     if config.model_type not in [
         "bert", "camembert", "distilbert", "xlm-roberta",
-        "roberta", "nomic_bert", "mistral", "qwen2"
+        "roberta", "nomic_bert", "mistral", "qwen2", "new",
Add a TODO here: we need to monitor whether Transformers assigns a better model type in the future, or whether "new" ends up conflicting with another model.
Added
Description
Brief description of what this PR is about