Refactor prompts #502
Conversation
@albertvillanova do you have pointers about that failed test? It's linked to the MagicMock; I don't really have experience with these.
Let me have a look! 😉
This will fix the MagicMock issue in the test.
However, there is another issue:
- MultiStepAgent does not implement initialize_system_prompt
- Therefore self.system_prompt = None
- self.system_prompt.strip() will raise an error:
AttributeError: 'NoneType' object has no attribute 'strip'
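The failure can be reproduced in isolation (a minimal sketch: the class below only mirrors the behavior described above, it is not the real smolagents MultiStepAgent):

```python
class MultiStepAgent:
    """Simplified stand-in: the base class has no real prompt builder."""

    def initialize_system_prompt(self):
        # Not implemented in the base class, so nothing is returned.
        return None

    def __init__(self):
        self.system_prompt = self.initialize_system_prompt()

agent = MultiStepAgent()
try:
    agent.system_prompt.strip()
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'strip'
```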
Oh, and btw @albertvillanova: explanations about the removal of
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
@albertvillanova regarding your comment above on
Does it make sense to return
To fix the failing test.
You removed managed_agent_prompt.
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
def check_always_fails(final_answer, agent_memory):
    assert False, "Error raised in check"

agent = CodeAgent(model=fake_code_model, tools=[], final_answer_checks=[check_always_fails])
@albertvillanova I've created this new logic with final_answer_checks to force the agent to keep running until the validation passes. It's quite handy for validation logic like "the output should be a list" or "let an LLM verify the output against a pre-defined validation prompt, then parse that LLM's output to see whether the check passes".
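A minimal sketch of the retry idea behind final_answer_checks (the run loop and check_is_list below are illustrative assumptions, not the actual smolagents implementation):

```python
def check_is_list(final_answer, agent_memory):
    # Hypothetical check: the agent's final answer must be a list.
    assert isinstance(final_answer, list), "Output should be a list"

def run_with_checks(produce_answer, checks, max_steps=5):
    """Keep running until every check passes, or give up after max_steps."""
    for step in range(max_steps):
        answer = produce_answer(step)
        try:
            for check in checks:
                check(answer, agent_memory=None)
        except AssertionError:
            continue  # a check failed: keep running
        return answer  # all checks passed
    raise RuntimeError("No answer passed validation")

# Fake agent output: wrong type on the first step, a list on the second.
answers = ["not a list", [1, 2, 3]]
print(run_with_checks(lambda step: answers[step], [check_is_list]))  # [1, 2, 3]
```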
Example of such a check for a task that involves making a plot:
import os

from PIL import Image
from smolagents import OpenAIServerModel
from smolagents.models import encode_image_base64, make_image_url

def check_reasoning_and_plot(final_answer, agent_memory):
    multimodal_model = OpenAIServerModel("gpt-4o")
    filepath = "saved_map.png"
    assert os.path.exists(filepath), "Make sure to save the plot under saved_map.png!"
    image = Image.open(filepath)
    prompt = (
        f"Here is a user-given task and the agent steps: {agent_memory.get_succinct_steps()}. Now here is the plot that was made."
        " Please check that the reasoning process and plot are correct: do they correctly answer the given task?"
        " First list 3 reasons why yes/no, then write your final decision: PASS in caps lock if it is satisfactory, FAIL if it is not."
        " Don't be harsh: if the plot mostly solves the task, it should pass."
        " But if any data was hallucinated/invented, you should refuse it."
    )
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": make_image_url(encode_image_base64(image))}},
            ],
        }
    ]
    output = multimodal_model(messages).content
    print("Feedback: ", output)
    if "FAIL" in output:
        raise Exception(output)
    return True
Thanks! Let's start sharing agents!