
Allow flatten_messages_as_text to be pushed via LiteLLMModel to fix Ollama integration #406

Merged
2 commits merged into huggingface:main on Feb 3, 2025

Conversation

sysradium
Contributor

@sysradium commented Jan 28, 2025

If a structured message is sent to ollama using:

model = LiteLLMModel(
    model_id="ollama_chat/mistral",
    api_base="http://127.0.0.1:11434",
    num_ctx=8192,
)

I receive the following error:

Error in generating model output:
litellm.APIConnectionError: Ollama_chatException - Client error '400 Bad Request' for url 'http://localhost:11434/api/chat'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400

The problem is that by default a structured message is sent, while Ollama expects it to be flat (a sketch of both message shapes follows at the end of this comment).
So to fix that, I decided to use the existing message flattening:

model = LiteLLMModel(
    model_id="ollama_chat/mistral",
    api_base="http://127.0.0.1:11434",
    num_ctx=8192,
    flatten_messages_as_text=True,  # now you can pass this when using Ollama
)

Maybe not the best solution, but it worked for me.
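
For context, here is a rough sketch of the two message shapes involved (an illustrative assumption about the payloads, not code taken from this PR):

# Illustration only: the structured shape sent by default vs. the flattened
# shape that Ollama's /api/chat endpoint accepts.

structured_message = {
    "role": "user",
    # content as a list of typed parts, as VLM chat templates expect
    "content": [{"type": "text", "text": "Hello!"}],
}

flattened_message = {
    "role": "user",
    # content as a plain string
    "content": "Hello!",
}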

@sysradium marked this pull request as ready for review January 28, 2025 23:22
@SixK

SixK commented Jan 29, 2025

Tested the fix and it seems OK.
Thanks for the fix.

Contributor

@merveenoyan left a comment


It seems a bit odd to me to add this as a class attribute. This argument and its related utilities were introduced because LLM and VLM chat templates differ in transformers, and the argument indicates how to format messages before passing them to the model.
Anyhow, if this completely fixes things for both LLMs and VLMs, I'd rather not make it a class attribute but simply always pass True.

But it's up to @aymeric-roucher to decide.

@sysradium
Contributor Author

@merveenoyan I was hoping it could be injected via kwargs here: https://github.com/huggingface/smolagents/blob/main/src/smolagents/models.py#L683
But there was no way to manipulate them.

I don't know why I didn't make it private (self._flatten_messages_as_text), but even if I did, I agree that it is a bit awkward. I also wasn't sure whether hardcoding the value to True would break anything.

@aymeric-roucher
Collaborator

aymeric-roucher commented Jan 29, 2025

The key distinction here, and the reason for the errors, is that VLMs need a list in "content" (like [{"type": "text", "text": "Hello!"}, {"type": "image", "image": base64_image}]) while LLMs need a string in "content".

We solved this issue for TransformersModel by auto-detecting the model type, checking whether it has a .processor attribute. Then, if the model is an LLM (self.is_vlm is False), the parameter flatten_messages_as_text is always set to True.

Maybe there could be a way to do this for Ollama models as well? We could even perform this detection by checking whether the Transformers version of the model has a tokenizer or a processor attribute, but loading the Transformers model is heavy, so I'm not sure we can easily check for the existence of a processor.
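
For illustration, a minimal sketch of what such processor-based detection could look like (the helper name should_flatten_messages is hypothetical, and this still requires the model id to resolve on the Hugging Face Hub, which an Ollama id like ollama_chat/mistral does not):

from transformers import AutoProcessor


def should_flatten_messages(model_id: str) -> bool:
    """Heuristic sketch: flatten for text-only models, keep structured content for VLMs."""
    try:
        processor = AutoProcessor.from_pretrained(model_id)
    except Exception:
        # No processor config could be resolved: assume a plain LLM.
        return True
    # Text-only checkpoints typically resolve to a tokenizer here, which has no
    # image_processor attribute; multimodal processors usually expose one.
    return not hasattr(processor, "image_processor")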

@sysradium force-pushed the fix-ollama-integration branch from 59bb664 to 4b54dfa on January 29, 2025
@sysradium
Contributor Author

sysradium commented Jan 29, 2025

@aymeric-roucher I think it might be a bit too error-prone if we add detection of this sort. At least I don't see a reliable way of doing it :/

So either we go with a separate Ollama model class (#368) or with the hack we have in this PR.

@RolandJAAI
Contributor

RolandJAAI commented Jan 29, 2025

@aymeric-roucher This fixes most Ollama-related issues for non-vision models (tested e.g. #264, which works), but using VLMs with Ollama and actually passing images does not seem to work with either value of flatten_messages_as_text. I tested both endpoints, ollama_chat/llava and ollama/llava, with the smolagents dev version and this fix, and got different errors depending on the endpoint and the value of flatten_messages_as_text, but no success, while Claude works fine. @sysradium does it work for you with images?

Since, as currently released, Ollama does not work at all without this fix (ollama_chat throws a 400; ollama/ works for messages but the tools fail), you could even do if "ollama" in self.model_id to decide whether to set self.flatten_messages_as_text=True, which would make the hack complete and prevent more issues from being raised. An Ollama model class might be a good idea to get rid of the hack next, or a PR in LiteLLM.

@aymeric-roucher
Collaborator

Thank you @RolandJAAI for the additional context. I agree with the proposed solution: if "ollama" in self.model_id, set flatten_messages_as_text to True.

@touseefahmed96
Contributor

@aymeric-roucher so there is no need for a separate Ollama model class (#368)?

@sysradium force-pushed the fix-ollama-integration branch from 4b54dfa to ef286ef on February 2, 2025
@sysradium
Contributor Author

@RolandJAAI I updated the PR to check the model_id. I just decided to use startswith instead of in (see the sketch below).
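
For illustration, a minimal sketch of that default (attribute and parameter names here are assumptions and may differ from the merged code):

class LiteLLMModel:
    def __init__(self, model_id: str, **kwargs):
        self.model_id = model_id
        # startswith rather than `in`: only ids using the Ollama provider
        # prefix ("ollama/..." or "ollama_chat/...") trigger flattening,
        # not any model id that merely contains the substring "ollama".
        self.flatten_messages_as_text = model_id.startswith("ollama")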

@merveenoyan
Contributor

@RolandJAAI can you open a reproducible issue and assign it to me? I'll look into it.

@albertvillanova
Member

albertvillanova commented Feb 3, 2025

Thinking about this issue and how it has appeared for several models in the past, I would suggest:

  • propose some automatic fixes under the hood, so users don't need to worry about this issue
    • as we did for vision/text transformers models
    • as we could also do for ollama models now
  • but at the same time, always allow users to customize the flatten_messages_as_text variable
    • this will be helpful for other models in the future that might be problematic
    • users always have a workaround (setting its value manually) instead of having to wait for the next smolagents patch release

What do you think?

@sysradium
Contributor Author

sysradium commented Feb 3, 2025

@albertvillanova a combined option could be:

def __init__(
    self,
    # ...
    _flatten_messages_as_text: Optional[bool] = None,
):
    self._flatten_messages_as_text = (
        _flatten_messages_as_text
        if _flatten_messages_as_text is not None
        else self.model_id.startswith("ollama")
    )

def __call__(
    self,
    messages,
    # ...
    **kwargs,
) -> ChatMessage:
    import litellm
    completion_kwargs = self._prepare_completion_kwargs(
        messages=messages,
        # ...
        convert_images_to_image_urls=True,
        flatten_messages_as_text=self._flatten_messages_as_text,
        custom_role_conversions=self.custom_role_conversions,
        **kwargs,
    )

That would let the user override it; otherwise autodetection based on the model_id is used. This could also be extracted so that users can override it in a subclass (using a property or a function call):

def __call__(self, messages, **kwargs) -> ChatMessage:
    import litellm
    completion_kwargs = self._prepare_completion_kwargs(
        messages=messages,
        # ...
        convert_images_to_image_urls=True,
        flatten_messages_as_text=self._flatten_messages_as_text,
        **kwargs,
    )

@property
def _flatten_messages_as_text(self) -> bool:
    return (
        self._force_message_flattening
        if self._force_message_flattening is not None
        else self.model_id.startswith("ollama")
    )

@touseefahmed96
Contributor

touseefahmed96 commented Feb 3, 2025

@albertvillanova Wouldn't a separate OllamaModel class be a good option? #368

Member

@albertvillanova left a comment


Thanks for addressing this. Let's merge it as is for a patch release and leave a more robust solution for a subsequent PR.

@albertvillanova merged commit 6d72ea7 into huggingface:main on Feb 3, 2025
3 checks passed
@albertvillanova
Member

@touseefahmed96, in principle, as a general rule, I would say a dedicated class might be necessary only for different model frameworks, and not for each model ID, so that we avoid having too many model classes. But let's continue this discussion in your PR!

@albertvillanova linked an issue on Feb 3, 2025 that may be closed by this pull request
@tolkienist42

Just updated to 1.6.0 with

pip install smolagents --upgrade

I also updated litellm the same way, and updated to the latest ollama.

Now if I use ollama_chat it will initially connect, but all the functionality is broken and it throws a bunch of errors. In the end, the only way I can get it to work is to get rid of the '_chat' part.

So basically, it's now a little bit improved, but I wouldn't consider this bug fixed? Unless I am doing something super wrong.

@SixK

SixK commented Feb 4, 2025

1.6.0 is more than a week old.
The latest version, 1.7.0, was released 4 days ago.
The fix is 17 hours old, so it may be released in 1.8.0.

@sysradium
Contributor Author

sysradium commented Feb 4, 2025

@tolkienist42 my bugfix hasn't made it to the release yet: v1.7.0...6d72ea7

@sysradium deleted the fix-ollama-integration branch on February 4, 2025
Successfully merging this pull request may close these issues.

v1.5.0 not working with ollama