feat: add streaming tool use #1884
base: main
Conversation
@abetlen The tests all pass, but the macOS ones were terminated after a timeout. I think this is due to a lack of CPU and/or memory resources on the CI runners, because the tests run fine on my macOS machine.
I would love to see this merged! Actually, there are quite a lot of good pull requests here that I would like to see merged, but this one is top priority!
Update: I rebased on the latest `main`.

(force-pushed from 2506581 to b4f8fde)
Worked well for me. Would you mind rebasing onto the latest commit to allow tool streaming with Qwen models?
This PR upgrades the `chatml-function-calling` chat handler with support for streaming tool use and fixes #1883, #1869, and #1756, among other improvements. A usage sketch of the streamed tool calls follows the change list below.

Changes:
1. General improvements:
   a. ✨ If no system message is supplied, add an empty system message to hold the tool metadata.
   b. ✨ Add function descriptions to the system message so that tool use is better informed (fixes #1869: chatml-function-calling not adding tool description to the prompt).
   c. ✨ Replace `print` statements relating to JSON grammars with `RuntimeWarning` warnings (see the warning sketch after this list).
   d. ✅ Add tests with fairly broad coverage of the different scenarios.
2. User-specified tool choice:
   a. ✨ Add support for more than one function call by making this a special case of "Automatic tool choice" with a single tool (subsumes #1503: Support parallel function calls with tool_choice).
3. Completions:
   a. ✨ Use the user-defined `stop` and `max_tokens`.
   b. 🐛 Replace incorrect use of the follow-up grammar with the user-defined grammar.
4. Automatic tool choice:
   a. ✨ Add support for streaming the function calls (fixes #1883: Feature request: add support for streaming tool use).
   b. ✨ Make tool calling more robust by giving the LLM an explicit way to terminate the tool calls by wrapping them in a `<function_calls></function_calls>` block.
   c. 🐛 Add the missing ":" stop token used to decide whether to continue with another tool call; its absence prevented parallel function calling (fixes #1756: chatml-function-calling chat format fails to generate multi calls to the same tool).
   d. ✨ Set temperature=0 when deciding whether to continue with another tool call, similar to the initial decision on whether to call a tool (a hypothetical sketch of this decision step follows the list).
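For reference, a minimal usage sketch of the streaming tool use this PR enables. The model path, tool schema, and prompt are placeholders, and the chunk handling assumes the OpenAI-compatible streaming shape that `create_chat_completion` returns:

```python
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", chat_format="chatml-function-calling")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # placeholder tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# With stream=True, tool calls now arrive incrementally as OpenAI-style delta
# chunks instead of only appearing in the final, fully assembled response.
for chunk in llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
    stream=True,
):
    delta = chunk["choices"][0].get("delta", {})
    for tool_call in delta.get("tool_calls") or []:
        function = tool_call.get("function") or {}
        # The name typically arrives in the first chunk of a call; the
        # arguments are then streamed as incremental JSON fragments.
        print(function.get("name") or function.get("arguments") or "", end="", flush=True)
```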
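The warning sketch for item 1c: a minimal illustration of swapping `print` diagnostics for `RuntimeWarning` warnings via Python's standard `warnings` module. The message text is illustrative, not the handler's actual wording:

```python
import warnings

# Before this PR, grammar-related diagnostics were printed to stdout; now they
# are emitted as RuntimeWarning so callers can filter, log, or escalate them.
warnings.warn(
    "Failed to build a JSON grammar for the tool parameters; "
    "falling back to unconstrained generation.",
    RuntimeWarning,
)
```

Callers can then silence these with `warnings.filterwarnings("ignore", category=RuntimeWarning)` instead of losing them to stdout.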
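Finally, a hypothetical sketch of the follow-up decision in items 4c and 4d, reusing `llm` from the usage sketch above. This is not the handler's actual code: `prompt_so_far` is a stand-in for the rendered chat transcript, and the `functions.` prefix check is an assumption about how a further tool call would begin in the prompt format:

```python
# Assumed placeholder: the chat transcript rendered so far, ending right after
# a completed tool call inside the <function_calls> block.
prompt_so_far = "...rendered transcript..."

# Deterministically ask the model what comes next. Stopping at ":" (the stop
# token this PR adds) lets the handler detect a "functions.<name>" prefix
# without committing to that call's arguments.
next_text = llm.create_completion(
    prompt=prompt_so_far,
    temperature=0,   # deterministic, mirroring the initial tool-use decision
    stop=[":"],
    max_tokens=16,
)["choices"][0]["text"]

another_tool_call = next_text.lstrip().startswith("functions.")
```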