-
-
Notifications
You must be signed in to change notification settings - Fork 266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: started working on always-fresh context message #281
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 592a950 in 1 minute and 56 seconds
More details
- Looked at
56
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. gptme/logmanager.py:321
- Draft comment:
The TODO comment about includinggit status -vv
context is not yet implemented. Ensure this functionality is completed or tracked in an issue before merging. - Reason this comment was not posted:
Comment did not seem useful.
2. gptme/logmanager.py:325
- Draft comment:
The TODO comment about getting mentioned files and ensuring their latest versions are in context is not yet implemented. Ensure this functionality is completed or tracked in an issue before merging. - Reason this comment was not posted:
Marked as duplicate.
3. gptme/logmanager.py:331
- Draft comment:
The TODO comment about stripping old patches and saves is not yet implemented. Ensure this functionality is completed or tracked in an issue before merging. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_CikcFQ4meMOMBZdj
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files@@ Coverage Diff @@
## master #281 +/- ##
==========================================
- Coverage 74.06% 73.45% -0.61%
==========================================
Files 67 68 +1
Lines 4770 4932 +162
==========================================
+ Hits 3533 3623 +90
- Misses 1237 1309 +72
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
The more I think about this the more of an improvement I realize that it would be. I should prioritize this. |
I just realized I did something very similar in https://github.com/ErikBjare/gptme/blob/master/scripts/treeofthoughts.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on d953629 in 54 seconds
More details
- Looked at
220
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
4
drafted comments based on config settings.
1. gptme/logmanager.py:320
- Draft comment:
Handle the case where a file is mentioned but does not exist to avoid potentialKeyError
. Consider usingfiles.get(f, 0)
instead offiles[f]
. - Reason this comment was not posted:
Comment looked like it was already resolved.
2. gptme/logmanager.py:364
- Draft comment:
HandleOSError
for long file paths ininclude_file_content
to prevent potential crashes. - Reason this comment was not posted:
Comment looked like it was already resolved.
3. gptme/logmanager.py:382
- Draft comment:
Consider logging or handling unexpected values forGPTME_FRESH_CONTEXT
to avoid confusion or unexpected behavior. - Reason this comment was not posted:
Confidence changes required:50%
Theprepare_messages
function inlogmanager.py
uses a feature flag to determine the context mode. However, it does not log or handle the case where the environment variable is set to an unexpected value. This could lead to confusion or unexpected behavior.
4. gptme/chat.py:311
- Draft comment:
HandleOSError
for long file paths in_include_paths
to prevent potential crashes. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_vt1sLrQNkkTLw3cC
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Needs #262 fixed before merging |
It kinda works now (will push soon), but it gets confused and says stuff like "I notice that I was previously incorrect in claiming to see the contents of files that weren't actually shown to me." What if we should just stuff the context up front? Before user messages? |
Here's a detailed specification for the fresh context message feature: Fresh Context Message SpecificationOverviewThe fresh context message feature improves how file contents and git status are provided to the LLM by maintaining a single source of truth for the current state of relevant files and working directory. Design Goals
ImplementationContext Message Format
File Selection and Ordering
Path Resolution
Git Integration
Message Placement
Configuration
Legacy ModeWhen fresh context is disabled:
Future Improvements
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on a563e79 in 36 seconds
More details
- Looked at
485
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. gptme/context.py:135
- Draft comment:
Consider centralizing the check forGPTME_FRESH_CONTEXT
to avoid redundancy and potential inconsistencies across the codebase. - Reason this comment was not posted:
Confidence changes required:50%
Thegather_fresh_context
function ingptme/context.py
usesos.getenv
to check for theGPTME_FRESH_CONTEXT
environment variable. This is done multiple times across different files. It might be beneficial to centralize this check to avoid redundancy and potential inconsistencies.
2. gptme/chat.py:279
- Draft comment:
Consider centralizing the check forGPTME_FRESH_CONTEXT
to avoid redundancy and potential inconsistencies across the codebase. - Reason this comment was not posted:
Confidence changes required:50%
The_include_paths
function ingptme/chat.py
usesos.getenv
to check for theGPTME_FRESH_CONTEXT
environment variable. This is done multiple times across different files. It might be beneficial to centralize this check to avoid redundancy and potential inconsistencies.
3. gptme/logmanager.py:311
- Draft comment:
Thegather_fresh_context
function appears to be a duplicate of the one ingptme/context.py
. Consider removing it to avoid redundancy. - Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_1BnjrTeJZOlJChMg
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 8ae33f2 in 9 seconds
More details
- Looked at
22
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. gptme/init.py:38
- Draft comment:
Consider providing more context on why settingTOKENIZERS_PARALLELISM
tofalse
is necessary and its potential impact on performance or functionality. - Reason this comment was not posted:
Confidence changes required:50%
The comment about setting the environment variable for TOKENIZERS_PARALLELISM is not clear enough about its impact on performance and potential side effects.
Workflow ID: wflow_Mc8iy30CTWNeUd0A
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
- Add tests for context handling functionality - Fix trailing whitespace and ruff issues - Improve file path handling in context gathering - Enhance context message formatting
8ae33f2
to
7cde498
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 7cde498 in 40 seconds
More details
- Looked at
541
lines of code in4
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. tests/test_context.py:22
- Draft comment:
Avoid usingos.chdir()
in tests as it can lead to side effects. Instead, usetmp_path
directly for path operations. - Reason this comment was not posted:
Confidence changes required:50%
The use ofos.chdir()
in tests can lead to side effects, especially if tests are run in parallel. It's better to usetmp_path
directly without changing the working directory.
2. gptme/context.py:188
- Draft comment:
Remove commented-out code forgit_status
andgh_pr_status
if not needed to keep the code clean. - Reason this comment was not posted:
Confidence changes required:30%
Thegather_fresh_context
function has commented-out code for git and PR status. If these are not needed, they should be removed to keep the code clean.
3. gptme/context.py:269
- Draft comment:
Ensure consistency in environment variable names.FRESH_CONTEXT
here should matchGPTME_FRESH_CONTEXT
used elsewhere. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_WWokdvjv8kJ7Nkrv
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
gptme/context.py
Outdated
files_fresh = [ | ||
f | ||
for f in files | ||
if not check_modified or f.stat().st_mtime <= datetime.timestamp(msg.timestamp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider handling FileNotFoundError
when accessing f.stat().st_mtime
to avoid runtime errors if the file is deleted between checks.
"""Run pre-commit checks on modified files and return output if there are issues.""" | ||
cmd = "pre-commit run --files $(git ls-files -m)" | ||
try: | ||
subprocess.run(cmd, shell=True, capture_output=True, text=True, check=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using shell=True
with subprocess.run
can be a security risk. Ensure the command is safe or refactor to avoid shell=True
.
Add fresh context mode that provides accurate file state information: - Add FRESH_CONTEXT environment flag to enable new mode - Implement context gathering from git and files - Track file paths and include contents before each user message - Improve file path handling and workspace relative paths - Add pre-commit check integration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 5ca4f1c in 18 seconds
More details
- Looked at
113
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
4
drafted comments based on config settings.
1. gptme/chat.py:13
- Draft comment:
Consider centralizing theuse_fresh_context
flag initialization to avoid multiple environment variable checks. This will ensure consistency and improve performance slightly. - Reason this comment was not posted:
Confidence changes required:50%
The environment variable for fresh context mode is checked multiple times in different files. This can lead to inconsistencies if the environment variable changes during runtime. It's better to centralize this check.
2. gptme/context.py:16
- Draft comment:
Consider centralizing theuse_fresh_context
flag initialization to avoid multiple environment variable checks. This will ensure consistency and improve performance slightly. - Reason this comment was not posted:
Confidence changes required:50%
The environment variable for fresh context mode is checked multiple times in different files. This can lead to inconsistencies if the environment variable changes during runtime. It's better to centralize this check.
3. gptme/chat.py:299
- Draft comment:
Consider adding support for directories in_include_paths
as mentioned in the TODO comment. This could enhance functionality. - Reason this comment was not posted:
Confidence changes required:30%
The_include_paths
function inchat.py
has a TODO comment about supporting directories. This is a potential area for improvement, but not a critical issue.
4. gptme/chat.py:308
- Draft comment:
Consider addressing the TODO about handling codeblocks within codeblocks in_include_paths
. This could improve robustness. - Reason this comment was not posted:
Confidence changes required:30%
The_include_paths
function inchat.py
has a TODO comment about handling codeblocks within codeblocks. This is a potential area for improvement, but not a critical issue.
Workflow ID: wflow_B4kRMRhDxZPLkUYa
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 944cb51 in 14 seconds
More details
- Looked at
32
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. gptme/chat.py:282
- Draft comment:
Consider refactoring_include_paths
to avoid duplicated logic for handling paths and URLs. Additionally, ensure that_parse_prompt
checks if a file is a text file before attempting to read its contents to avoid errors with binary files. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_ThdBCSgRoOW3We4u
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
- Add file_to_display_path to handle relative/absolute paths consistently - Enhance context gathering with fresh context mode support - Improve handling of binary files in file parsing - Better workspace path handling and validation - Add pre-commit check integration - Refactor file content inclusion logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on fc54429 in 15 seconds
More details
- Looked at
103
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. gptme/context.py:276
- Draft comment:
Ensure that the placeholder text for modified files is clearly marked and distinguishable from actual file content to avoid confusion. - Reason this comment was not posted:
Confidence changes required:50%
Theappend_file_content
function incontext.py
has a potential issue with handling file paths. If a file is modified after the message timestamp, it appends a placeholder text instead of the actual content. This could lead to confusion if the placeholder is not clearly marked or if the user expects the actual content.
2. gptme/context.py:286
- Draft comment:
Ensure thatlast_user_idx
is correctly handled when there are no user messages to avoid incorrect message insertion. - Reason this comment was not posted:
Confidence changes required:50%
Theenrich_messages_with_context
function incontext.py
has a potential issue with the insertion of the fresh context message. Iflast_user_idx
isNone
, it defaults to-1
, which might not be the intended behavior if there are no user messages. This could lead to incorrect message insertion.
Workflow ID: wflow_H5kPp8dU64B6rpHk
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 5fd6b62 in 26 seconds
More details
- Looked at
35
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. gptme/context.py:70
- Draft comment:
Consider explicitly checking if a file is a text file before attempting to read it, to avoid relying on catchingUnicodeDecodeError
. This can improve clarity and performance. - Reason this comment was not posted:
Confidence changes required:50%
The current implementation ofappend_file_content
has a potential issue with handling non-text files. The function attempts to read the file content and if it fails due to aUnicodeDecodeError
, it skips the file. However, it should also handle the case where the file is not a text file by design, and not just due to an error.
2. gptme/context.py:79
- Draft comment:
The list comprehension should filter out files that are not infiles_text
instead offiles_fresh
. This ensures that only files with content are removed from the list.
files=[f for f in files if f not in files_text],
- Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_90CQP49NTl6T9miM
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on e88ca95 in 33 seconds
More details
- Looked at
244
lines of code in7
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. gptme/chat.py:340
- Draft comment:
Remove unnecessary print statement to clean up the code. - Reason this comment was not posted:
Confidence changes required:50%
The print statement on line 340 is unnecessary and should be removed to clean up the code.
Workflow ID: wflow_ExI4kn9P0QNqyrap
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on bbbbae8 in 13 seconds
More details
- Looked at
39
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. gptme/chat.py:337
- Draft comment:
Remove commented-out code to keep the codebase clean and maintainable.
- Reason this comment was not posted:
Confidence changes required:50%
The commented-out print statement should be removed to keep the code clean.
2. gptme/context.py:191
- Draft comment:
Thegit
parameter should have a default value ofTrue
for clarity and consistency.
msgs: list[Message], workspace: Path | None, git: bool = True
- Reason this comment was not posted:
Confidence changes required:50%
The function signature should have a consistent style, and the default argument should be a boolean value.
Workflow ID: wflow_vDhARiGX8pCd4vq2
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
I've had this idea for a while to gather all the dynamic context, like files and RAG results, and always output the most up-to-date versions of them right before a user message (not including previous/stale such messages).
This should save considerably on tokens (but may break caching), improve the reliability of patches, and unifies
files
handling.TODO
cd
after attaching images to message #262Even in legacy mode (a regression)TODO (to merge, not to finish)
We could also set the current date and time in such a message, instead of the system prompt (#137).
Important
Introduces fresh context message handling to ensure latest file versions and git status are included in messages, with changes in
chat.py
,context.py
, andlogmanager.py
.chat.py
by tracking paths as files instead of codeblocks.prepare_messages()
inlogmanager.py
to strip old context and ensure latest file versions are included.gather_fresh_context()
incontext.py
to collect fresh file contents and git status._include_paths()
inchat.py
to support fresh context mode.test_context.py
forfile_to_display_path()
,append_file_content()
,gather_fresh_context()
, andget_mentioned_files()
.Counter
import incontext.py
for file mention tracking.context.py
.This description was created by
for bbbbae8. It will automatically update as commits are pushed.