Automatically Feed Artifacts In Task Memory To LLM #1432
This is working as intended given the current implementation. You would need to use Task Memory. I'm going to update this issue as an enhancement to make this possible without a secondary query tool.

---
For future reference: this might be relevant to some of the Meta Memory refactors. CC @vasinov

---
Oh wait... we do support this! But only some models support it. Last I checked, Claude was the only one that could take an image input from a Tool. This works for me:

```python
from griptape.drivers import AnthropicPromptDriver
from griptape.structures import Agent
from griptape.tools import FileManagerTool

agent = Agent(
    prompt_driver=AnthropicPromptDriver(model="claude-3-5-sonnet-20240620"),
    stream=True,
    tools=[FileManagerTool()],
)
agent.run("Describe this file: assets/mountain.jpg")
```

I can look into Gemini Flash, but this might be out of our hands.

---
Neither OpenAI nor Gemini appears to support images coming from Tools, which means the best solution is to wire up the two steps with a Pipeline or Workflow. For instance:

```python
from griptape.structures import Pipeline
from griptape.tasks import PromptTask, ToolTask
from griptape.tools import FileManagerTool

pipeline = Pipeline(
    tasks=[
        ToolTask(tool=FileManagerTool(), id="file"),
        PromptTask(lambda task: task.parent_outputs["file"]),
    ],
)
pipeline.run("Describe this file: assets/mountain.jpg")
```

---
Whoa... so this is how I could "chat" with it:

```python
import os

from dotenv import load_dotenv

from griptape.drivers import GooglePromptDriver
from griptape.structures import Pipeline
from griptape.tasks import PromptTask, ToolTask
from griptape.tools import FileManagerTool
from griptape.utils import Chat

load_dotenv()
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")

pipeline = Pipeline(
    tasks=[
        ToolTask(tool=FileManagerTool(), id="file"),
        PromptTask(
            lambda task: task.parent_outputs["file"],
            prompt_driver=GooglePromptDriver(
                api_key=GOOGLE_API_KEY, model="gemini-2.0-flash-exp", stream=True
            ),
        ),
    ],
)
Chat(pipeline).start()
```

That actually kind of works! I hadn't thought about chatting with a pipeline. It is a bit strange, though: it requires that every conversation I have with it involves looking up file information. Would switching ToolTask to ToolkitTask allow for a more natural conversation? Is there a better way?
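On the ToolTask vs. ToolkitTask question, the difference can be sketched in plain Python (the helper names and the decision predicate below are illustrative, not griptape APIs): a ToolTask-style step runs its tool on every turn, while a ToolkitTask-style step lets the model decide per turn whether a tool is needed at all, which is what makes the conversation feel more natural.

```python
# Illustrative sketch only -- these helpers are NOT griptape APIs.
# A ToolTask-style step always invokes its tool, so every turn must
# involve a file lookup; a ToolkitTask-style step only invokes a tool
# when the model decides the turn needs one.

def tool_task_turn(user_input: str, load_file) -> str:
    # ToolTask-style: the tool runs unconditionally.
    return load_file(user_input)

def toolkit_task_turn(user_input: str, tools: dict, wants_tool) -> str:
    # ToolkitTask-style: a per-turn decision (made by the LLM in practice,
    # stubbed here as a predicate) chooses whether to call a tool at all.
    if wants_tool(user_input):
        return tools["file_manager"](user_input)
    return f"(answered directly, no tool) {user_input}"
```

With a stubbed predicate that only triggers on the word "file", `toolkit_task_turn("hello", ...)` answers directly, while a turn mentioning a file routes through the tool.

---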
Yes, I think you might prefer this pattern.

---
Yeah, it's a better pattern. I left a comment there about my lack of master Lego builder status :)

---
@shhlife can we close this issue? It sounds like there is still some work to do around making these patterns more discoverable, but I'd prefer to track that in a separate issue.

---
That's so interesting that Claude can take an image input from a tool but OpenAI can't. It does seem to be reading the image, though; the output looks right.

---
The only thing OpenAI saw was
More context on the topic: https://community.openai.com/t/returning-image-as-result-of-function-call-to-gpt-4-turbo/714903
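For what it's worth, the limitation shows up in the shape of the chat-completions messages themselves: a tool result is sent back as a `role: "tool"` message whose `content` is text, so an image returned by a tool has no slot to travel in, and the model only ever sees its textual representation. A minimal sketch (the call id and artifact text below are made up for illustration):

```python
# Sketch of an OpenAI-style tool-result message. The "tool" role message
# carries textual content only, so a binary image returned by a tool call
# can reach the model only as text (e.g. an artifact repr or a file path).
tool_result = {
    "role": "tool",
    "tool_call_id": "call_abc123",  # hypothetical id
    "content": "<ImageArtifact: assets/mountain.jpg>",  # text, not pixels
}
```

---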
@collindutter that context was super helpful, thank you! Yeah, let's hope OpenAI allows images in tool calls. I'll close this issue and make a separate one about discoverability. Cheers!

---
Describe the bug
I'm trying to use the FileManagerTool with an agent and telling it to load an image file. The image file exists, and I can load it with an ImageLoader, but I'd like to use the FileManagerTool to be a bit more flexible and give the agent the ability to load files as it needs.
When I tell it to load the file I get the following message:
The FileManagerTool does have the ability to load other types of files besides text:
To Reproduce
Expected behavior
I expect the agent to be able to load and interpret files based on the capabilities of the model.