
Refactor prompts #502

Merged
merged 15 commits into main, Feb 6, 2025
Conversation

aymeric-roucher
Collaborator

No description provided.

@aymeric-roucher aymeric-roucher changed the title Share prompts Refactor prompts Feb 5, 2025
@aymeric-roucher
Collaborator Author

@albertvillanova do you have pointers about that failed test? It's linked to MagicMock; I don't really have experience with these.
https://github.com/huggingface/smolagents/actions/runs/13163295123/job/36737183697?pr=502

@albertvillanova
Member

Let me have a look! 😉

@albertvillanova (Member) left a comment:

This will fix the MagicMock issue in the test.

@albertvillanova (Member) left a comment:

However there is another issue:

  • MultiStepAgent does not implement initialize_system_prompt
  • Therefore: self.system_prompt = None
  • self.system_prompt.strip() will raise an error
AttributeError: 'NoneType' object has no attribute 'strip'
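The failure mode can be reproduced in isolation. A minimal sketch with toy class names (not the real smolagents classes), assuming `system_prompt` is populated from an `initialize_system_prompt` hook that the base class leaves unimplemented:

```python
class ToyMultiStepAgent:
    """Mimics a base class that does not implement initialize_system_prompt."""

    def initialize_system_prompt(self):
        return None  # base class defines no prompt

    @property
    def system_prompt(self):
        return self.initialize_system_prompt()


class ToyCodeAgent(ToyMultiStepAgent):
    """Subclasses override the hook and return a real string, so .strip() is safe."""

    def initialize_system_prompt(self):
        return "You are a helpful coding agent.\n"


try:
    ToyMultiStepAgent().system_prompt.strip()
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'strip'

print(ToyCodeAgent().system_prompt.strip())
```

Calling `.strip()` on the base class's `None` prompt raises exactly the `AttributeError` quoted above, while any subclass that returns a string is fine.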

@aymeric-roucher
Collaborator Author

Oh, and by the way @albertvillanova, here's the explanation for the removal of the single_step option: it was only introduced for a benchmark and is not really useful, since single-step agents struggle to solve real tasks, and it introduced lots of overhead with a dedicated prompt: thus the removal!

aymeric-roucher and others added 3 commits February 5, 2025 21:15
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
@aymeric-roucher
Collaborator Author

@albertvillanova regarding your comment above on system_prompt being None: I've removed the .strip() from the offending line, WDYT?

@albertvillanova
Member

albertvillanova commented Feb 6, 2025

@albertvillanova regarding your comment above on system_prompt being None: I've removed the .strip() from the offending line, WDYT?

Does it make sense to return "text": None?

[Message(role=MessageRole.SYSTEM, content=[{"type": "text", "text": None}])]
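One way to avoid emitting `"text": None` would be to build the system message conditionally. A hypothetical sketch (illustrative function name, not the actual smolagents fix):

```python
def build_system_messages(system_prompt):
    """Return the system-message list, skipping it entirely when no prompt exists."""
    if system_prompt is None:
        return []  # no system message at all, instead of {"text": None}
    return [
        {
            "role": "system",
            "content": [{"type": "text", "text": system_prompt.strip()}],
        }
    ]


print(build_system_messages(None))    # []
print(build_system_messages("  Hi  "))
```

This keeps the downstream message payload free of `None` text fields while still stripping real prompts.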

@albertvillanova (Member) left a comment:

To fix the failing test.

@albertvillanova (Member) left a comment:

You removed managed_agent_prompt.

aymeric-roucher and others added 5 commits February 6, 2025 14:01
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
def check_always_fails(final_answer, agent_memory):
    assert False, "Error raised in check"

agent = CodeAgent(model=fake_code_model, tools=[], final_answer_checks=[check_always_fails])
@aymeric-roucher (Collaborator, Author) commented Feb 6, 2025:

@albertvillanova I've created this new logic with final_answer_checks to force the agent to keep running while validation does not pass. It's quite handy for validation logic like "output should be a list", or "have an LLM verify your output against a pre-defined validation prompt and parse that LLM's output to find whether the test passes".
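A minimal sketch of the first kind of check mentioned above ("output should be a list"); the function name here is illustrative:

```python
def check_output_is_list(final_answer, agent_memory):
    # final_answer_checks callables receive the candidate answer and the
    # agent's memory; raising (e.g. via assert) makes the agent keep running.
    assert isinstance(final_answer, list), "Output should be a list!"
    return True


check_output_is_list([1, 2, 3], agent_memory=None)  # passes silently
try:
    check_output_is_list("not a list", agent_memory=None)
except AssertionError as e:
    print(e)  # Output should be a list!
```

The check would then be passed as `final_answer_checks=[check_output_is_list]` when constructing the agent, following the pattern shown in the test snippet above.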

@aymeric-roucher (Collaborator, Author) commented:

Example of such a check for a task that involves making a plot:

import os

from PIL import Image

# Helper imports: OpenAIServerModel, encode_image_base64 and make_image_url
# ship with smolagents; the exact module path may vary by release.
from smolagents import OpenAIServerModel
from smolagents.models import encode_image_base64, make_image_url


def check_reasoning_and_plot(final_answer, agent_memory):
    multimodal_model = OpenAIServerModel("gpt-4o")
    filepath = "saved_map.png"
    assert os.path.exists(filepath), "Make sure to save the plot under saved_map.png!"
    image = Image.open(filepath)
    prompt = (
        f"Here is a user-given task and the agent steps: {agent_memory.get_succinct_steps()}. "
        "Now here is the plot that was made. "
        "Please check that the reasoning process and plot are correct: do they correctly answer the given task? "
        "First list 3 reasons why yes/no, then write your final decision: PASS in caps lock if it is satisfactory, FAIL if it is not. "
        "Don't be harsh: if the plot mostly solves the task, it should pass. "
        "But if any data was hallucinated/invented, you should refuse it."
    )
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": make_image_url(encode_image_base64(image))}},
            ],
        }
    ]
    output = multimodal_model(messages).content
    print("Feedback: ", output)
    if "FAIL" in output:
        raise Exception(output)
    return True

@albertvillanova (Member) left a comment:

Thanks! Let's start sharing agents!

@aymeric-roucher aymeric-roucher merged commit 8ba036b into main Feb 6, 2025
4 checks passed