Allow flatten_messages_as_text to be pushed via LiteLLMModel to fix Ollama integration #406
Conversation
Tested the fix and it seems ok.
Somehow it seems a bit odd to me to add this as a class attribute. This arg and the utils around it were made because LLM and VLM chat templates are different in transformers, and the arg indicates how to format messages before passing them to the model.
Anyhow, if this completely fixes things for both LLM and VLM, I'd rather not make it a class attribute but just pass True anyway.
But it's up to @aymeric-roucher to decide.
@merveenoyan I was hoping it could be injected via kwargs here: https://github.com/huggingface/smolagents/blob/main/src/smolagents/models.py#L683 I don't know why I did not make it private.
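For readers following along, here is a minimal sketch of the kwarg-injection idea being discussed. This is not the actual smolagents implementation: the subclass name, the abbreviated `__call__` signature, and the default value are assumptions; only `_prepare_completion_kwargs`, `custom_role_conversions`, and `ChatMessage` come from the code referenced above.

```python
from smolagents.models import ChatMessage, LiteLLMModel


class PatchedLiteLLMModel(LiteLLMModel):  # hypothetical subclass, sketch only
    def __call__(self, messages, **kwargs) -> ChatMessage:
        import litellm

        completion_kwargs = self._prepare_completion_kwargs(
            messages=messages,
            # A caller-provided value wins; otherwise fall back to a default.
            flatten_messages_as_text=kwargs.pop("flatten_messages_as_text", False),
            convert_images_to_image_urls=True,
            custom_role_conversions=self.custom_role_conversions,
            **kwargs,
        )
        response = litellm.completion(**completion_kwargs)
        # ... build and return a ChatMessage from `response` as the real method does
```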
The key distinction here, and the reason for the errors, is that VLMs need a list in "content" (like [...]). We solved this issue for [...]. Maybe there could be a way to do this as well in Ollama models? Maybe we could even perform this detection by checking whether the Transformers version of the model has a [...].
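To make that distinction concrete, here is roughly what the two message shapes look like (illustrative payloads, not taken from the codebase):

```python
# Structured (VLM-style) shape: "content" is a list of typed blocks.
structured_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
    ],
}

# Flattened (plain-LLM) shape: "content" is a single string.
flattened_message = {
    "role": "user",
    "content": "Describe this image.",
}
```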
@aymeric-roucher I think it might be a bit too error-prone if we add a detection of this sort. At least I don't see a reliable way of doing it :/ So either we go with a separate Ollama model class (#368) or the hack we have in this PR.
@aymeric-roucher This fixes most ollama-related issues for non-vision models (tested e.g. #264, works), but using VLMs with ollama and actually passing images does not seem to work with either value of flatten_messages_as_text. I tested both endpoints with [...]. Since the current released status is that ollama does not work at all without this fix, [...].
Thank you @RolandJAAI for the additional context: I agree with the proposed solution: [...]
@aymeric-roucher So no need for a separate Ollama model class (#368)?
@RolandJAAI Updated the PR to check for the model_id. Just decided to use [...].
@RolandJAAI Can you open a reproducible issue and assign it to me? I'll look into it.
Thinking about this issue and how it appeared for several models in the past, I would suggest: [...]
What do you think?
@albertvillanova The combined option could have been:

```python
def __init__(
    # ...
    _flatten_messages_as_text: Optional[bool] = None,
):
    self._flatten_messages_as_text = (
        _flatten_messages_as_text
        if _flatten_messages_as_text is not None
        else self.model_id.startswith("ollama")
    )

def __call__(
    # ...
    **kwargs,
) -> ChatMessage:
    import litellm

    completion_kwargs = self._prepare_completion_kwargs(
        messages=messages,
        # ...
        convert_images_to_image_urls=True,
        flatten_messages_as_text=self._flatten_messages_as_text,
        custom_role_conversions=self.custom_role_conversions,
        **kwargs,
    )
```

That would let the user override the value; otherwise autodetection based on the model_id is used. This could also be extracted so that users can override it in a subclass (using a property or a function call):

```python
def __call__(...) -> ChatMessage:
    import litellm

    completion_kwargs = self._prepare_completion_kwargs(
        messages=messages,
        # ...
        convert_images_to_image_urls=True,
        flatten_messages_as_text=self._flatten_messages_as_text,
        **kwargs,
    )

@property
def _flatten_messages_as_text(self) -> bool:
    return (
        self._force_message_flattening
        if self._force_message_flattening is not None
        else self.model_id.startswith("ollama")
    )
```
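If something along the lines of the first snippet above were adopted, usage might look like this (hypothetical, based only on that sketch; the merged code may differ):

```python
# Hypothetical usage of the sketch above, not the released API.
ollama_model = LiteLLMModel(model_id="ollama_chat/llama3.1")
# No explicit value passed -> autodetected as True, because the model_id
# starts with "ollama".

openai_model = LiteLLMModel(model_id="gpt-4o", _flatten_messages_as_text=False)
# An explicit value always overrides the model_id-based autodetection.
```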
@albertvillanova Isn't separating out an OllamaModel class a good option? #368
Thanks for addressing this. Let's merge it as is for a patch release and leave a more robust solution for a subsequent PR.
@touseefahmed96, in principle, as a general rule, I would say a dedicated class might be necessary only for different model frameworks, and not for each model ID, so that we avoid having too many model classes. But let's continue this discussion in your PR!
Just updated to 1.6.0 with `pip install smolagents --upgrade`; also updated litellm the same way. Now if I use `ollama_chat` it will initially connect, but all the functionality is broken and it throws a bunch of errors. In the end the only way I can get it to work is to get rid of the `_chat` part. So basically it's a little bit improved now, but I wouldn't consider this bug fixed? Unless I am doing something super wrong.
1.6.0 is more than a week old.
@tolkienist42 my bugfix hasn't made it into a release yet: v1.7.0...6d72ea7
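For readers hitting the same thing: with LiteLLM the provider prefix selects which Ollama endpoint is used, so the two configurations being compared look roughly like this (the model name and server URL are placeholders):

```python
from smolagents import LiteLLMModel

# "ollama/..." is routed to Ollama's generate-style completion path in LiteLLM.
generate_model = LiteLLMModel(
    model_id="ollama/llama3.1",
    api_base="http://localhost:11434",  # placeholder local Ollama server
)

# "ollama_chat/..." is routed to Ollama's chat endpoint; this is the variant
# the comment above reports as still erroring at that point.
chat_model = LiteLLMModel(
    model_id="ollama_chat/llama3.1",
    api_base="http://localhost:11434",
)
```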
If a structured message is sent to ollama using: [...]
I receive the following error: [...]
The problem is that by default a structured message is sent, whilst ollama expects it to be flat.
So to fix that, I decided to utilise the existing flattening of messages: [...]
Maybe not the best solution, but it worked for me.
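As a rough illustration of what "flattening" means here (a simplified stand-in, not the actual smolagents helper):

```python
# Simplified stand-in for the flattening behaviour: collapse a structured
# content list into a single string, which is the flat shape Ollama expects
# for text-only chat.
def flatten_content(message: dict) -> dict:
    content = message["content"]
    if isinstance(content, list):
        text = "\n".join(
            block.get("text", "") for block in content if block.get("type") == "text"
        )
        return {"role": message["role"], "content": text}
    return message


print(flatten_content({
    "role": "user",
    "content": [{"type": "text", "text": "Hello Ollama"}],
}))
# -> {'role': 'user', 'content': 'Hello Ollama'}
```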