add RAG for better PR feedback attempt2 #62
The code snippets you provided appear to be parts of a larger project that uses Flask for web development and interacts with GitHub APIs. Here's an analysis of the changes made:

1. New function

A new function called embed_code_base:

    def embed_code_base(repo_id, root_dir):
        # Clone repo if needed and clone to specific branch (main)
        temp_dir = clone_repo_branch(installation_id, repo_full_name, "main")
        collection = get_collection_for_repo_branch(repo_id)
        embedding = model.encode(f"{func['source_code']}").tolist()
        add_to_chromaDB(collection, ids=[repo_id], embeddings=[embedding])

This function clones the repository's main branch and then embeds its code into ChromaDB.

2. Changes to handle_pull_requests

    def handle_pull_requests(data, installation_id):
        # ...
        embed_code_base(
            repo_id, f"{ROOT_DIR}/src"
        )  # TODO: 1. add embeddings when a repository is added, ensure root dir is from the repo of repo_id and main branch code is embedded
        # ...

3. New function

A new function called add_issues_to_chroma:

    def add_issues_to_chroma(collection, func):
        # ...
        if func["function_path"] is not None:
            func["function_path"] = _format_function_path(
                func["function_path"], file_extensions
            )
        # ...

This function now formats the function path with _format_function_path whenever one is present.

Overall, these changes appear to be part of a larger refactoring effort that aims to improve code organization and handling in the project. The updates focus on embedding repository code into ChromaDB, cloning repositories for specific branches, and formatting file paths correctly.
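To make the embedding step more concrete, here is a minimal, standard-library-only sketch of how functions might be pulled out of a source file and stored with one entry per function. It is not the PR's implementation: `toy_embed` is a hypothetical stand-in for `model.encode`, the plain `dict` stands in for a ChromaDB collection, and the per-function ID format (`repo_id:path:name`) is an assumption made for illustration.

```python
import ast
import hashlib
import math


def toy_embed(text, dim=64):
    """Hypothetical stand-in for model.encode: hashed bag of tokens, L2-normalised."""
    vec = [0.0] * dim
    for token in text.split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def extract_functions(source, path):
    """Yield one record per function definition found in a Python source string."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            yield {
                "function_name": node.name,
                "function_path": path,
                "source_code": ast.get_source_segment(source, node),
            }


def embed_source(store, repo_id, source, path):
    """Embed every function in `source` and file it under a per-function ID."""
    for func in extract_functions(source, path):
        # Assumed ID scheme: one entry per function rather than one per repository.
        func_id = f"{repo_id}:{path}:{func['function_name']}"
        store[func_id] = {"embedding": toy_embed(func["source_code"]), **func}
    return store


sample = "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b\n"
store = embed_source({}, "repo-1", sample, "src/math_utils.py")
print(sorted(store))
```

Keying each entry by path and function name keeps functions individually retrievable. Nested functions or methods that share a name within one file would collide under this simple scheme.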
This code snippet is part of a larger application that handles webhooks from GitHub. It specifically deals with pull requests and uses a vector database (likely ChromaDB) to store and retrieve information about the functions in the codebase. Key Components:
Example Usage:
Todos:
This approach allows the application to efficiently store and retrieve information about the codebase, enabling features like code search, similarity detection, and more.
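The retrieval side of such a store can be sketched as a nearest-neighbour lookup over stored embeddings. The example below is only an illustration in plain Python: the `store` dictionary, the three-dimensional vectors, and the `query_top_k` helper are all hypothetical, and a real vector database such as ChromaDB would handle this lookup internally.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def query_top_k(store, query_embedding, k=2):
    """Return the IDs of the k stored embeddings most similar to the query."""
    scored = [(cosine(query_embedding, emb), func_id) for func_id, emb in store.items()]
    scored.sort(reverse=True)
    return [func_id for _, func_id in scored[:k]]


# Hypothetical function IDs and toy embeddings.
store = {
    "src/auth.py:login": [1.0, 0.0, 0.0],
    "src/auth.py:logout": [0.9, 0.1, 0.0],
    "src/db.py:connect": [0.0, 0.0, 1.0],
}
nearest = query_top_k(store, [1.0, 0.0, 0.0], k=2)
print(nearest)  # ['src/auth.py:login', 'src/auth.py:logout']
```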
closes #55
This PR improves the PR feedback functionality. Previously, only the PR title, description, and code diff were used for review.
Now, relevant function code from the codebase is retrieved from a vector database via RAG and considered alongside those inputs.
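As a rough illustration of what combining these inputs could look like, the sketch below assembles a review prompt from the PR title, description, diff, and any retrieved functions. The `build_review_prompt` helper, the section labels, and the sample data are assumptions for illustration only; the PR's actual prompt format is not shown here.

```python
def build_review_prompt(title, description, diff, related_functions):
    """Join the PR metadata, the diff, and retrieved function code into one prompt string."""
    sections = [
        f"PR title: {title}",
        f"PR description: {description}",
        "Code diff:\n" + diff,
    ]
    if related_functions:
        # Each retrieved record is assumed to carry a path and the function's source.
        context = "\n\n".join(
            f"# {func['function_path']}\n{func['source_code']}"
            for func in related_functions
        )
        sections.append("Related functions from the codebase:\n" + context)
    return "\n\n".join(sections)


related = [
    {"function_path": "src/math_utils.py", "source_code": "def add(a, b):\n    return a + b"},
]
prompt = build_review_prompt(
    "Use add() in totals",
    "Replaces inline addition with the shared helper.",
    "+ total = add(subtotal, tax)",
    related,
)
print(prompt)
```

When no related functions are retrieved, the prompt falls back to the title, description, and diff alone, which matches the previous behaviour described above.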
Workflow: