
AI-Researcher: Fully-Automated Scientific Discovery with LLM Agents

Project Page | Slack Community | Discord Community | Documentation | Paper | Datasets

Welcome to AI-Researcher 🤗! AI-Researcher introduces a breakthrough in Automated Scientific Discovery 🔬, presenting a new system that fundamentally reshapes the traditional research paradigm. This state-of-the-art platform empowers researchers with:

  • 🎯 Full Autonomy: Complete end-to-end research automation
  • 🔄 Seamless Orchestration: From concept to publication
  • 🧠 Advanced AI Integration: Powered by cutting-edge AI agents
  • 🚀 Research Acceleration: Streamlined scientific innovation

✨ The AI-Researcher system accepts user input queries at two distinct levels ✨

Level 1: Detailed Idea Description
At this level, users provide comprehensive descriptions of their specific research ideas. The system processes these detailed inputs to develop implementation strategies based on the user's explicit requirements.

Level 2: Reference-Based Ideation
This simpler level involves users submitting reference papers without a specific idea in mind. The user query typically follows the format: "I have some reference papers, please come up with an innovative idea and implement it with these papers." The system then analyzes the provided references to generate and develop novel research concepts.


🌟Core Capabilities & Integration
AI-Researcher delivers a Comprehensive Research Ecosystem through seamless integration of critical components:

🚀Primary Research Functions

  • 📚 Literature Review: Conducts comprehensive analysis and synthesis of existing research.
  • 📊 Idea Generation: Systematically gathers, organizes, and formulates novel research directions.
  • 🧪 Algorithm Design and Implementation: Develops methodologies and transforms ideas into functional implementations.
  • 💻 Algorithm Validation and Refinement: Automates testing, performance evaluation, and iterative optimization.
  • 📈 Result Analysis: Delivers advanced interpretation of experimental data and insights.
  • ✍️ Manuscript Creation: Automatically generates polished, full-length academic papers.
Quick Overview of AI-Researcher.

🔥 News

  • [2025, Mar 04]: 🎉🎉 We've launched AI-Researcher! The release includes the complete framework, datasets, benchmark construction pipeline, and much more. Stay tuned; there's plenty more to come! 🚀


⚡ Quick Start

Installation

AI-Researcher Installation

git clone https://github.com/HKUDS/AI-Researcher.git
cd AI-Researcher
pip install -e .

Docker Installation

To set up the agent-interactive environment, we use Docker for containerization. Please ensure you have Docker installed on your system before proceeding. For running the research agent, we utilize the Docker image 'tjbtech1/paperagent:latest'. You can pull this image by executing the following command:

docker pull tjbtech1/paperagent:latest

API Keys Setup

Create an environment variable file based on the provided '.env.template' file. In this file, set the API keys for the LLMs you intend to use. Note that not all LLM API keys are mandatory—simply include the ones relevant to your needs.

OPENAI_API_KEY=
DEEPSEEK_API_KEY=
ANTHROPIC_API_KEY=
GEMINI_API_KEY=
HUGGINGFACE_API_KEY=
GROQ_API_KEY=
XAI_API_KEY=
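
If you want to confirm your keys are picked up, here is a minimal sketch assuming the variables are read from the environment (e.g., via python-dotenv); the project's actual loading mechanism may differ:

```python
# Hypothetical check, not part of the AI-Researcher codebase:
# verify that the keys from your .env file are visible to Python.
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads the .env file in the current directory

for key in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "DEEPSEEK_API_KEY"):
    print(key, "set" if os.getenv(key) else "missing")
```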

⬇️ Examples

⚠️ ALERT: The GIFs below are large files and may take some time to load. Please be patient while they render completely.

Example 1 (Category: Vector Quantized)

Input: Prompt

I have some reference papers, please implement the following idea with these papers:

  1. The proposed model designed in this paper is designed to improve the performance of Vector Quantized Variational AutoEncoders (VQ-VAEs) by addressing issues with gradient propagation through the non-differentiable vector quantization layer.
  2. ...
  1. The core methodologies utilized include:
    • Rotation and Rescaling Transformation: A linear transformation that alters the encoder output to align it with the nearest codebook vector without changing the forward pass output.
    • Gradient Propagation Method: The proposed model ensures that gradients flow from the decoder to the encoder while preserving the angle between the gradient and codebook vector.
    • Codebook Management: Utilizes the connection between the encoder output and the corresponding codebook vectors to mitigate codebook collapse and improve utilization.
  2. The primary functions of these components are:
    • The rotation and rescaling transformation modifies how the encoder output is quantized and how information is retained during backpropagation, enabling gradients to reflect the true positioning of the encoder output relative to the codebook vectors.
    • The gradient propagation method redefines how gradients are transported back to the encoder, allowing for an enhanced and nuanced movement through the quantization layer, which leads to a better performance during training.
    • Codebook management practices help in maintaining a diverse set of codebook vectors throughout training, avoiding scenarios where multiple vectors become redundant or unused.
  3. Implementation details for each component:
    • Key Parameters:
      • Codebook size should be configured based on the complexity of the dataset (e.g., 1024 or 8192).
      • Commitment loss coefficient (β) is typically set within [0.25, 2].
    • Input/Output Specifications:
      • Input to the encoder is a continuous high-dimensional vector, while the output is a corresponding quantized vector from the codebook.
      • The output for reconstruction is generated using the decoder applied to the transformed codebook vectors.
    • Important Constraints:
      • Ensure that the codebook is updated correctly with an exponential moving average procedure, and treat both rotation and rescaling during the forward pass as constants with respect to the gradient.
  4. Step-by-Step Integration of Components:
    • Step 1: Input the data vector into the encoder to obtain the continuous representation.
    • Step 2: Identify the nearest codebook vector to the encoder output.
    • Step 3: Compute the rotation matrix that aligns the encoder output to the codebook vector.
    • Step 4: Apply the rotation and rescaling transformation to obtain the modified output for the decoder (i.e., \(\tilde{q}\)).
    • Step 5: Feed \(\tilde{q}\) into the decoder to produce the reconstructed output.
    • Step 6: Compute the loss using the reconstruction and apply backpropagation.
    • Step 7: During backpropagation, modify the gradient transfer process to maintain the angle using the proposed model, replacing traditional shortcuts in gradient computation.
  5. Critical implementation details affecting performance:
    • The choice of rotation matrix calculation should ensure computational efficiency—using Householder transformations to minimize resource demands.
    • The deployment of the stop-gradient technique effectively turns off the back-propagation through the quantization layer, which is essential to reflect the intended change without inducing undesired noise in the gradient updates.
    • Monitor the codebook usage regularly during training to detect any potential collapse early and adjust the training dynamics (e.g., learning rate) accordingly to maintain effective utilization throughout the training period.
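
For illustration only: below is a minimal, hypothetical PyTorch sketch of the quantization step described in this prompt (Steps 2-5). It uses a single Householder reflection as a simplified stand-in for the rotation, treats the alignment transform and rescaling as constants for gradients, and trains the codebook with a commitment loss rather than the EMA update mentioned above. Names such as `RotatedVectorQuantizer` are made up for this sketch; it is not the implementation generated by AI-Researcher.

```python
import torch
import torch.nn.functional as F

def align_to_codebook(e, q, eps=1e-8):
    """Map encoder output e onto its nearest codebook vector q with a linear
    transform treated as a constant for backpropagation. Sketch only: a
    Householder reflection H (mapping e/|e| to q/|q|) plus a rescaling |q|/|e|,
    so the forward value equals q while gradients flow to e through a fixed map."""
    e_hat = F.normalize(e, dim=-1, eps=eps)
    q_hat = F.normalize(q, dim=-1, eps=eps)
    v = (e_hat - q_hat).detach()                      # reflection axis, constant w.r.t. gradients
    v_sq = (v * v).sum(dim=-1, keepdim=True).clamp_min(eps)
    reflected = e - 2.0 * v * (v * e).sum(dim=-1, keepdim=True) / v_sq
    scale = (q.norm(dim=-1, keepdim=True) /
             e.norm(dim=-1, keepdim=True).clamp_min(eps)).detach()
    return scale * reflected                          # forward value equals q

class RotatedVectorQuantizer(torch.nn.Module):
    def __init__(self, codebook_size=1024, dim=64):
        super().__init__()
        self.codebook = torch.nn.Parameter(0.02 * torch.randn(codebook_size, dim))

    def forward(self, e):                             # e: (batch, dim) continuous encoder output
        idx = torch.cdist(e, self.codebook).argmin(dim=-1)   # nearest codebook entry
        q = self.codebook[idx]
        q_tilde = align_to_codebook(e, q)             # fed to the decoder
        commit_loss = F.mse_loss(e, q.detach())       # weight by beta in [0.25, 2]
        return q_tilde, idx, commit_loss
```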
Input: Reference Papers
  1. Neural discrete representation learning
  2. ...
  1. Straightening out the straight-through estimator: Overcoming optimization challenges in vector quantized networks
  2. Estimating or propagating gradients through stochastic neurons for conditional computation
  3. High-resolution image synthesis with latent diffusion models
  4. Finite scalar quantization: Vq-vae made simple
  5. Elements of information theory
  6. Vector-quantized image modeling with improved vqgan
  7. Uvim: A unified modeling approach for vision with learned guiding codes
  8. Auto-encoding variational bayes
  9. Categorical reparameterization with gumbel-softmax
Output (PDF): Self-Organized Paper (fully generated by AI-Researcher; click to view).
Output (workplace): Self-Organized Workplace (fully generated by AI-Researcher; may take time to load, click to view).

Example 2 (Category: Vector Quantized)

Input: Prompt

I have some reference papers, please implement the following idea with these papers:

  1. The proposed model focuses on discrete representation learning for tasks such as image generation, depth estimation, colorization, and segmentation using the proposed approach integrated into architectures like autoregressive transformers.
  2. ...
  1. Core Techniques:
    • Simplified Quantization: Use a simplified quantization approach utilizing scalar quantization instead of VQ.
    • Dimensionality Projection: Define a function to project the encoder output to a manageable dimensionality (typically between 3 to 10).
    • Gradient Propagation: Implement the Straight-Through Estimator (STE) for gradient propagation through the quantization operation.
  2. Technical Components:
    • Bounding Function: This compresses data dimensionality and confines values to a desired range. Use a function like \(f(z) = \left\lfloor \frac{L}{2} \right\rfloor \tanh(z)\) to project the data, where \(L\) is the number of quantization levels.
    • Quantization process: Round each bounded dimension to its nearest integer to yield the quantized output.
    • Loss function: Operate under a reconstruction loss paradigm typical in VAEs to optimize the proposed model parameters.
  3. Implementation Details:
    • Key Parameters:
      • Number of dimensions \(d\) and levels \(L\) per dimension should be defined based on the codebook size you aim to replicate (e.g., set \(L_i \geq 5\) for all \(i\)).
    • Input/Output Specifications:
      • The input to the bounding function will be the output from the final encoder layer; the output after quantization will be in the format \(\hat{z}\), with shape matching the original \(z\).
    • Constraints:
      • Ensure all inputs are preprocessed adequately to be within the functioning range of the bounding function.
  4. Step-by-Step Integration:
    • Step 1: Train a standard VAE model and obtain its encoder output \(z\).
    • Step 2: Apply the bounding function \(f\) on \(z\) to limit the output dimensions to usable values.
    • Step 3: Quantize the resultant bounded \(z\) using the rounding procedure to generate \( \hat{z} \).
    • Step 4: Use the original \(z\) and \(\hat{z}\) in conjunction with the reconstruction loss to backpropagate through the network using the STE for gradient calculation.
  5. Critical Implementation Details:
    • Ensure the rounding process is correctly differentiable; utilize the STE to maintain gradient flow during backpropagation.
    • Maintain high codebook utilization by selecting optimal dimensions and levels based on empirical trials, and monitor performance to refine the parameters if needed.
    • Adjust the proposed model configurations (number of epochs, batch size) based on the structures laid out in this paper, ensuring hyperparameters match those recommended for the proposed approach integration.
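
For illustration only: a minimal, hypothetical PyTorch sketch of the bounding-and-rounding quantizer described in this prompt, with the straight-through estimator keeping gradients flowing. It follows the formula given above and is not the agent-generated implementation.

```python
import torch

def fsq_quantize(z, levels):
    """Hypothetical sketch of the bounding + rounding quantizer described above.
    z: (..., d) encoder output; levels: number of quantization levels L_i per dimension.
    Applies f(z) = floor(L/2) * tanh(z), rounds to the nearest integer, and uses the
    straight-through estimator so gradients pass through the rounding unchanged."""
    L = torch.tensor(levels, dtype=z.dtype, device=z.device)   # (d,)
    bounded = torch.floor(L / 2) * torch.tanh(z)               # confine each dimension
    rounded = torch.round(bounded)                             # nearest integer level
    return bounded + (rounded - bounded).detach()              # STE: forward=rounded, backward=identity

# Example: d = 5 dimensions with 5 levels each (implicit codebook size 5^5 = 3125)
z_hat = fsq_quantize(torch.randn(8, 5), levels=[5, 5, 5, 5, 5])
```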
Input: Reference Papers
  1. Neural discrete representation learning
  2. ...
  1. Conditional probability models for deep image compression
  2. High-fidelity generative image compression
  3. End-to-end optimized image compression
  4. Taming transformers for high-resolution image generation
  5. An algorithm for vector quantizer design
  6. Joint autoregressive and hierarchical priors for learned image compression
  7. Assessing generative models via precision and recall
  8. Variational bayes on discrete representation with self-annealed stochastic quantization
  9. High quality monocular depth estimation via transfer learning
Output (PDF): Self-Organized Paper (fully generated by AI-Researcher; click to view).
Output (workplace): Self-Organized Workplace (fully generated by AI-Researcher; may take time to load, click to view).

Example 3 (Category: Recommendation)

Input: Prompt

I have some reference papers, please implement the following idea with these papers:

  1. The proposed model aims to improve user-item interaction predictions in recommendation systems by leveraging heterogeneous relational information.
  2. ...
  1. Core Techniques/Algorithms:
    • Heterogeneous Graph Neural Networks (GNNs): Used for embedding initialization and message propagation across different types of user-item and user-user/item-item graphs.
    • Contrastive Learning: Specifically, a cross-view contrastive learning framework is utilized to enhance representation learning by aligning embeddings from auxiliary views with user-item interaction embeddings.
    • Meta Networks: Employed to extract personalized knowledge and facilitate customized knowledge transfer between auxiliary views and the user-item interaction view.
  2. Purpose and Function of Each Major Component:
    • Heterogeneous GNN: Encodes user and item relationships into embeddings that capture the semantics of various interactions.
    • Contrastive Learning: Provides self-supervision signals to enhance the robustness of learned representations, allowing the proposed model to distinguish between relevant and irrelevant interactions.
    • Meta Network: Models personalized characteristics to facilitate adaptive knowledge transfer, ensuring that the influence of auxiliary information is tailored to individual users and items.
  3. Implementation Details:
    • Heterogeneous GNN:
      • Key Parameters: Use Xavier initializer for embedding initialization; set the hidden dimensionality d.
      • Input/Output: Take adjacency matrices for user-item, user-user, and item-item graphs as input; output relation-aware embeddings.
      • Constraints: Ensure that the GNN can handle varying types of nodes and relations.
    • Contrastive Learning:
      • Key Parameters: Use cosine similarity as the similarity function; define a temperature coefficient for handling negative samples.
      • Input/Output: Input embeddings from the meta network and user/item views; output contrastive loss values.
      • Constraints: Maintain diverse representations to avoid overfitting.
    • Meta Network:
      • Key Parameters: Set up fully connected layers with PReLU activation to generate personalized transformation matrices.
      • Input/Output: Input user and item embeddings; output transformed embeddings for personalized knowledge transfer.
      • Constraints: Ensure low-rank decomposition of transformation matrices to reduce parameter count.
  4. Step-by-Step Interaction:
    • Initialize user and item embeddings using a heterogeneous GNN.
    • Perform heterogeneous message propagation to refine embeddings iteratively across user-item, user-user, and item-item graphs.
    • Aggregate the refined embeddings from various views using a mean pooling function to retain heterogeneous semantics.
    • Extract meta knowledge from the learned embeddings to create personalized mapping functions using the meta network.
    • Apply contrastive learning to align embeddings from auxiliary views with the user-item interaction embeddings, generating a contrastive loss.
    • Combine the contrastive loss with a pairwise loss function (like Bayesian Personalized Ranking) to optimize the proposed model.
  5. Critical Implementation Details:
    • Choose appropriate hyperparameters such as embedding size, learning rate, and the number of GNN layers through systematic experimentation.
    • Monitor the proposed model for signs of overfitting, especially when increasing the number of GNN layers or embedding dimensions.
    • Ensure diverse user-item interaction patterns are captured through sufficient training data and effective augmentation techniques.
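
As one small illustration of the cross-view contrastive objective described in this prompt, here is a hypothetical InfoNCE-style loss with cosine similarity and a temperature coefficient (a sketch only, not the agent's implementation):

```python
import torch
import torch.nn.functional as F

def cross_view_infonce(view_a, view_b, temperature=0.2):
    """Hypothetical sketch of the cross-view contrastive objective described above.
    view_a / view_b: (N, d) embeddings of the same N users (or items) from an
    auxiliary view and the user-item interaction view; row i of each view forms the
    positive pair, and all other rows serve as negatives."""
    a = F.normalize(view_a, dim=-1)                 # cosine similarity via normalized dot products
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature                # (N, N) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)         # align matching rows, repel the rest
```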
Input: Reference Papers
  1. Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach
  2. ...
  1. Graph Neural Networks for Social Recommendation
  2. Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning
  3. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation
  4. Knowledge-aware Coupled Graph Neural Network for Social Recommendation
  5. Heterogeneous Graph Transformer
  6. Sequential Recommendation with Graph Neural Networks
Output (PDF): Self-Organized Paper (fully generated by AI-Researcher; click to view).
Output (workplace): Self-Organized Workplace (fully generated by AI-Researcher; may take time to load, click to view).

Example 4 (Category: Recommendation)

Input: Prompt

I have some reference papers, please implement the following idea with these papers:

  1. The proposed model focuses on collaborative filtering for recommendation systems by leveraging graph neural networks (GNNs) and contrastive learning to address the issue of sparse user-item interactions.
  2. ...
  1. Core Techniques:
    • Graph Neural Networks: Utilize GNNs for message passing to learn user and item embeddings from the interaction graph.
    • Disentangled Representations: Implement a mechanism to model multiple latent intent factors driving user-item interactions.
    • Contrastive Learning: Use contrastive learning techniques to generate adaptive self-supervised signals from augmented views of user-item interactions.
  2. Purpose of Components:
    • GNN Layers: Capture high-order interactions among users and items through iterative message passing.
    • Intent Encoding: Differentiate latent intents to improve the representation of user preferences.
    • Adaptive Augmentation: Generate contrastive views that account for both local and global dependencies to enhance robustness against noise.
  3. Implementation Details:
    • Graph Construction:
      • Input: User-item interaction matrix \( A \) of size \( I \times J \) (where \( I \) is the number of users and \( J \) is the number of items).
      • Output: Normalized adjacency matrix \( \bar{A} \).
    • GNN Configuration:
      • Number of layers \( L \): Choose based on your dataset, typically 2 or 3 layers.
      • Dimensionality \( d \) of embeddings: Start with \( d = 32 \).
    • Intent Prototypes:
      • Number of intents \( K \): Experiment with values from {32, 64, 128, 256}, starting with \( K = 128 \).
    • Learning Rate: Use Adam optimizer with a learning rate around \( 1e-3 \).
    • Loss Functions:
      • Use Bayesian Personalized Ranking (BPR) loss for the recommendation task.
      • Implement InfoNCE loss for contrastive learning, incorporating both local and global augmented views.
  4. Step-by-Step Interaction:
    • Construct the interaction graph from the user-item matrix.
    • For each GNN layer:
      • Compute the aggregated embeddings \( Z(u) \) and \( Z(v) \) using the normalized adjacency matrix.
      • Update user and item embeddings using residual connections to prevent over-smoothing.
    • Generate intent-aware representations by aggregating embeddings over the latent intents.
    • Apply the learned parameterized masks for adaptive augmentation during message passing to create multiple contrastive views.
    • Calculate contrastive learning signals using the generated augmented representations and optimize using the combined loss function.
  5. Critical Implementation Details:
    • Ensure that the augmentation matrices are learned adaptively based on the current user-item embeddings to differentiate the importance of interactions.
    • Monitor the performance with different numbers of latent intents \( K \) to find an optimal balance between expressiveness and noise.
    • Regularly assess the proposed model for over-smoothing by checking the Mean Average Distance (MAD) metric on the embeddings.
    • Tune hyperparameters \( \lambda_1, \lambda_2, \lambda_3 \) for the multi-task loss to balance the contribution of the self-supervised learning signals.
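
For illustration only: a minimal, hypothetical sketch of the graph-construction and residual message-passing steps described in this prompt, using dense matrices for clarity (a real implementation would use sparse operations); it is not the agent-generated code.

```python
import torch

def normalized_adjacency(A):
    """Hypothetical sketch of the graph-construction step above: symmetric
    normalization A_bar = D_u^{-1/2} A D_v^{-1/2} of a dense (I, J) user-item
    interaction matrix with 0/1 entries."""
    d_u = A.sum(dim=1).clamp_min(1.0).pow(-0.5)     # user degree^(-1/2)
    d_v = A.sum(dim=0).clamp_min(1.0).pow(-0.5)     # item degree^(-1/2)
    return d_u.unsqueeze(1) * A * d_v.unsqueeze(0)

def propagate(A_bar, user_emb, item_emb):
    """One GNN message-passing layer with residual connections to limit
    over-smoothing, as suggested in the step-by-step interaction above."""
    new_user = A_bar @ item_emb + user_emb          # users aggregate from interacted items
    new_item = A_bar.t() @ user_emb + item_emb      # items aggregate from interacting users
    return new_user, new_item
```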
Input: Reference Papers
  1. Lightgcn: Simplifying and powering graph convolution network for recommendation
  2. ...
  1. Neural collaborative filtering
  2. Disentangled contrastive learning on graphs
  3. Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning
  4. Curriculum Disentangled Recommendation with Noisy Multi-feedback
  5. Disentangled heterogeneous graph attention network for recommendation
  6. Learning intents behind interactions with knowledge graph for recommendation
  7. LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation
  8. Self-supervised graph learning for recommendation
Output (PDF): Self-Organized Paper (fully generated by AI-Researcher; click to view).
Output (workplace): Self-Organized Workplace (fully generated by AI-Researcher; may take time to load, click to view).

Example 5 (Category: Diffusion and Flow Matching)

Input: Prompt

I have some reference papers, please implement the following idea with these papers:

  1. The proposed model presented in this paper focuses on the task of generative modeling through the framework of Continuous Normalizing Flows (CNFs) to define straight flows between noise and data samples.
  2. ...
  1. Architecture:
    • Implement a neural network to parameterize the velocity field \( v_{\theta}(t, x) \) that maps from noise to data distributions.
    • Use architectures suitable for continuous functions, such as feedforward or convolutional networks.
    • Each layer should have non-linear activation functions (e.g., ReLU, Tanh).
  2. Loss Functions:
    • Velocity Consistency Loss: This should be structured as: \[ L_{\theta} = E_{t \sim U} E_{x_t, x_{t+\Delta t}} \| f_{\theta}(t, x_t) - f_{\theta}(t+\Delta t, x_{t+\Delta t}) \|^2_2 + \alpha \| v_{\theta}(t, x_t) - v_{\theta}(t+\Delta t, x_{t+\Delta t}) \|^2_2 \] where \( f_{\theta}(t, x_t) = x_t + (1 - t) v_{\theta}(t, x_t) \). Choose \( \alpha \) based on cross-validation performance.
  3. Training Procedure:
    • Sample \( x_0 \) from the noise distribution \( p_0 \).
    • For multiple time segments, define intervals and compute velocity fields iteratively.
    • Use the weights of the proposed approach in an exponential moving average to stabilize training.
  4. Sampling Process:
    • For single-step or multi-step generation, heuristically sample from the noise distribution and use the learned velocity field as follows: \[ x_{i/k} = x_{(i-1)/k} + \frac{1}{k} v_{i\theta}((i-1)/k, x_{(i-1)/k}) \]
    • Apply the Euler method for iterative updates: \[ x_{t + \Delta t} = x_t + \Delta t v_i(t, x_t) \] where \( t \in [i/k, (i + 1)/k - \Delta t] \).
  5. Key Implementation Details:
    • Ensure the network is equipped with a suitable optimizer such as Adam with a learning rate around \( 2 \times 10^{-4} \).
    • The batch size should be appropriately set (e.g., 512 for CIFAR-10).
    • Employ an ODE solver, suggested as Euler's method, during the training and sampling processes.
    • Maintain a uniform distribution for sampling time intervals \( U \).
  6. Performance Considerations:
    • Monitor convergence rates and empirically validate parameter configurations through experiments. Start with fewer segments and gradually increase to capture complex distributions better.
    • Adjust the decay rate for the EMA based on the stability of convergence (commonly around 0.999).
    • Analyze the trade-offs between sampling efficiency and sample quality, ensuring a balance during proposed model development.
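
For illustration only: a minimal, hypothetical PyTorch sketch of the velocity-consistency loss and Euler sampling described in this prompt. The callable `v_theta(t, x)`, the default `alpha`, and the step size `dt` are assumptions made for the sketch; this is not the agent-generated implementation.

```python
import torch

def velocity_consistency_loss(v_theta, x0, x1, alpha=1e-5, dt=1e-2):
    """Hypothetical sketch of the velocity-consistency objective above.
    v_theta(t, x) returns the velocity field; x0 is a noise batch, x1 a data batch,
    and x_t lies on the straight path between them."""
    t = torch.rand(x0.size(0), *([1] * (x0.dim() - 1)), device=x0.device) * (1.0 - dt)
    x_t = (1 - t) * x0 + t * x1
    x_next = (1 - (t + dt)) * x0 + (t + dt) * x1
    v_t, v_next = v_theta(t, x_t), v_theta(t + dt, x_next)
    f_t = x_t + (1 - t) * v_t                      # f_theta(t, x_t): predicted endpoint
    f_next = x_next + (1 - (t + dt)) * v_next
    return ((f_t - f_next) ** 2).mean() + alpha * ((v_t - v_next) ** 2).mean()

def euler_sample(v_theta, x, n_steps=10):
    """Multi-step Euler sampling with the learned velocity field."""
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.size(0), *([1] * (x.dim() - 1))), i * dt, device=x.device)
        x = x + dt * v_theta(t, x)
    return x
```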
Input: Reference Papers
  1. Flow matching for generative modeling
  2. ...
  1. Consistency models
  2. Rectified Flow
  3. Denoising diffusion probabilistic models
  4. Optimal flow matching: Learning straight trajectories in just one step
  5. Maximum likelihood training of score-based diffusion models
  6. Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Output (PDF): Self-Organized Paper (fully generated by AI-Researcher; click to view).
Output (workplace): Self-Organized Workplace (fully generated by AI-Researcher; may take time to load, click to view).

Example 6 (Category: Graph Neural Networks)

Input: Prompt

I have some reference papers, please implement the following idea with these papers:

  1. The proposed model focuses on the task of node classification in large graphs, addressing challenges like scalability, heterophily, long-range dependencies, and the absence of edges.
  2. ...
  1. The core techniques used in this study include a kernelized Gumbel-Softmax operator for all-pair message passing, which reduces computational complexity to linear (O(N)), and a Transformer-style network architecture designed for layer-wise learning of latent graph structures.
  2. The purpose of the kernelized Gumbel-Softmax operator is to enable differentiable learning of discrete graph structures by approximating categorical distributions. The Transformer-style architecture facilitates information propagation between arbitrary pairs of nodes through learned latent graphs.
  3. Implementation details for each component:
    • Kernelized Gumbel-Softmax Operator: Set the temperature parameter (τ) to a range typically between 0.25 and 0.4 for training. It operates on node feature representations (D-dimensional feature vectors). The output of this operator is a distribution over node connections, facilitating the selection of neighbors for message passing.
    • Node Feature Input: Each node input should be represented as a feature vector (e.g., {x_u} ∈ R^D), and the output is an updated representation of the node embedding after message passing.
    • Relational Bias (if applicable): Introduces activation (e.g., sigmoid) to adjust the message passing weights based on an observed adjacency matrix, which enhances weight assignment for connected nodes.
    • Edge Regularization Loss: Combines categorical edge probabilities with a supervised classification loss, encouraging the network to maintain predicted edges consistent with observed edges.
  4. The step-by-step interaction of these components includes:
    • Begin with an input matrix of node embeddings (X) and, if available, an adjacency matrix (A).
    • Apply the kernelized Gumbel-Softmax operator to the embedding matrix to generate a probability distribution over neighbor selection for each node.
    • Use these probabilities to sample neighbors, allowing for message passing where each node aggregates information from its selected neighbors.
    • Update the node embeddings using an attention mechanism, which can be enhanced by relational bias if edges are available.
    • After K iterations of neighbor sampling, apply loss functions comprising a supervised classification loss and, if applicable, edge-level regularization loss to optimize the embedding representations.
  5. Critical implementation details affecting performance involve:
    • Careful tuning of the temperature parameter (τ) in the Gumbel-Softmax operator, as it significantly influences the proposed approach's capacity to capture the discrete nature of graph structures.
    • Utilizing appropriate batch sizes for large-scale graphs, ensuring enough memory is available while also maintaining computational efficiency.
    • Choosing the correct dimensionality for random features in the kernel approximation, balancing model expressiveness and training stability.
    • The use of dropout or other regularization techniques such as edge-level regularization can influence the proposed model's generalization capabilities on unseen data.
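
For illustration only: a minimal, hypothetical sketch of differentiable neighbor selection with the Gumbel-Softmax described in this prompt. The dense (N, N) logits are an assumption made for clarity; the full method uses a kernelized, linear-complexity operator instead. Not the agent-generated code.

```python
import torch
import torch.nn.functional as F

def sample_latent_neighbors(pair_logits, tau=0.3, hard=True):
    """Differentiable neighbor selection via Gumbel-Softmax (hypothetical sketch).
    pair_logits: (N, N) unnormalized scores between all node pairs; tau follows
    the 0.25-0.4 range suggested above."""
    return F.gumbel_softmax(pair_logits, tau=tau, hard=hard, dim=-1)

def message_passing_step(node_feats, pair_logits, tau=0.3):
    """Each node aggregates features from its sampled latent neighbors."""
    weights = sample_latent_neighbors(pair_logits, tau)   # rows are (near) one-hot
    return weights @ node_feats
```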
Input: Reference Papers
  1. On the bottleneck of graph neural networks and its practical implications
  2. ...
  1. Semi-supervised classification with graph convolutional networks
  2. Categorical reparameterization with gumbel-softmax
  3. Learning discrete structures for graph neural networks
  4. Mixhop: Higher-order graph convolutional architectures via sparsified neighborhood mixing
  5. Graph attention networks
  6. Geometric deep learning: going beyond euclidean data
  7. Graph structure learning for robust graph neural networks
  8. Geom-gcn: Geometric graph convolutional networks
  9. New benchmarks for learning on non-homophilous graphs
  10. Latent patient network learning for automatic diagnosis
  11. Few-shot learning with graph neural networks
  12. The graph neural network model
  13. Characteristic functions on graphs: Birds of a feather, from statistical descriptors to parametric models
  14. Beyond homophily in graph neural networks: Current limitations and effective designs
Output (PDF): Self-Organized Paper (fully generated by AI-Researcher; click to view).
Output (workplace): Self-Organized Workplace (fully generated by AI-Researcher; may take time to load, click to view).

Example 7 (Category: Graph Neural Networks)

Input: Prompt

I have some reference papers, please implement the following idea with these papers:

  1. The proposed approach works on the task of uncovering data dependencies and learning instance representations from datasets that may not have complete or reliable relationships, particularly in semi-supervised contexts like node classification, image/text classification, and spatial-temporal dynamics prediction.
  2. ...
  1. The core techniques/algorithms used in this paper include an energy-constrained diffusion model represented as a partial differential equation (PDE), an explicit Euler scheme for numerical solutions, and a form of adaptive diffusivity function based on the energy function. The proposed architecture utilizes a diffusion-based Transformer framework that allows for all-pair feature propagation among instances.
  2. The major technical components serve the following purposes:
    • Diffusion Process: Encodes instances into evolving states by modeling information flow, where instance representations evolve according to a PDE illuminating the relationships among the instances.
    • Energy Function: Provides constraints to regularize the diffusion process and guide the proposed model towards desired low-energy embeddings, enhancing the quality of representations.
    • Diffusivity Function: Specifies the strength of information flow between instances, adapting based on the instance states, and allows for flexible and efficient propagation strategies.
  3. Implementation details for each component:
    • Diffusion Process Input: Requires a batch of instances represented as a matrix of size \(N \times D\), where \(N\) is the number of instances and \(D\) is the input feature dimension.
    • Diffusion Process Output: Produces the updated instance representations after \(K\) propagation steps. The step size \(\tau\) should be set within the range (0, 1).
    • Energy Function: Implemented as \(E(Z, k; \delta) = ||Z - Z^{(k)}||^2_F + \lambda \sum_{i,j} \delta(||z_i - z_j||^2_2)\), with \(\delta\) being a non-decreasing, concave function.
    • Key Parameters:
      • Step size \(\tau\)
      • Layer number \(K\) (number of diffusion propagation steps)
      • Regularization weight \(\lambda\).
  4. Step-by-step description of interactions:
    • Start by initializing the instance representations.
    • For each layer of diffusion, compute the diffusivity \(S(k)\) based on current embeddings through a function \(f\) which can be defined differently depending on the proposed model implementation.
    • Update the instance representations using the defined diffusion equations, ensuring to conserve states and introduce propagation according to the computed diffusivity.
    • After \(K\) layers of diffusion, apply a final output layer to produce logits for predictions.
  5. Critical implementation details that affect performance:
    • The choice of diffusivity function \(f\) greatly impacts the proposed model's capacity to learn complex dependencies, where specific formulations (like linear or logistic) yield different abilities in capturing inter-instance relationships.
    • Ensure that the values of \(\tau\) and \(\lambda\) are set appropriately to balance convergence speed and representation quality; using a smaller \(\tau\) may require deeper layers to learn effectively.
    • Optimization parameters like learning rate and early stopping criteria are essential, particularly for large-scale datasets where convergence behavior can vary widely depending on architecture size and complexity.
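
For illustration only: a minimal, hypothetical sketch of one explicit-Euler diffusion step described in this prompt. The softmax-similarity diffusivity is just one simple choice made for the sketch, not the method's actual diffusivity function, and this is not the agent-generated implementation.

```python
import torch
import torch.nn.functional as F

def simple_diffusivity(Z):
    """One simple (hypothetical) diffusivity function f: softmax over scaled
    pairwise dot-product similarities of the current instance states."""
    return F.softmax(Z @ Z.t() / Z.size(-1) ** 0.5, dim=-1)   # (N, N), rows sum to 1

def diffusion_step(Z, diffusivity_fn=simple_diffusivity, tau=0.5):
    """One explicit-Euler step of the all-pair diffusion update described above.
    Z: (N, D) instance states; tau in (0, 1) is the step size. Each state is
    partially conserved and partially replaced by propagated information."""
    S = diffusivity_fn(Z)
    return (1.0 - tau) * Z + tau * (S @ Z)

def diffuse(Z, K=4, tau=0.5):
    """K propagation layers before a final prediction head."""
    for _ in range(K):
        Z = diffusion_step(Z, tau=tau)
    return Z
```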
Input: Reference Papers
  1. Diffusion-convolutional neural networks
  2. ...
  1. Semi-supervised classification with graph convolutional networks
  2. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples
  3. Geometric deep learning: going beyond euclidean data
  4. Artificial neural networks for solving ordinary and partial differential equations
  5. Scaling graph neural networks with approximate pagerank
  6. Learning discrete structures for graph neural networks
  7. Semi-supervised learning using gaussian fields and harmonic functions
  8. Graph convolutional networks
  9. Deep learning via semi-supervised embedding
  10. A generalization of transformer networks to graphs
  11. Graph Convolution and Quadratic Time Complexity
  12. Bayesian graph convolutional neural networks for semi-supervised classification
  13. Do transformers really perform bad for graph representation?
  14. Big bird: Transformers for longer sequences
  15. Adaptive graph diffusion networks
  16. Transformers are RNNs
  17. Collective classification in network data
  18. NodeFormer: A scalable graph structure learning transformer for node classification
Output (PDF): Self-Organized Paper (fully generated by AI-Researcher; click to view).
Output (workplace): Self-Organized Workplace (fully generated by AI-Researcher; may take time to load, click to view).

✨How AI-Researcher works

  • 🔄 End-to-End Scientific Research Automation System
    Our AI-Researcher provides comprehensive automation for the complete scientific research lifecycle through an integrated pipeline. The system orchestrates research activities across three strategic phases:
    1. Literature Review & Idea Generation 📚💡

      • 🔍 Resource Collector: Systematically gathers comprehensive research materials across multiple scientific domains through automated collection from major academic databases (e.g., arXiv, IEEE Xplore, ACM Digital Library, and Google Scholar), code platforms (e.g., GitHub, Hugging Face), and open datasets.

      • 🧠 Resource Filter: Evaluates and selects high-impact papers, well-maintained code implementations, and benchmark datasets through quality metrics (e.g., citation count, code maintenance, data completeness) and relevance assessment.

      • 💭 Idea Generator: Leveraging the identified research resources, including high-impact papers and code repositories, the Idea Generator systematically formulates novel research directions through comprehensive analysis. It automatically evaluates current methodological limitations, maps emerging technological trends, and explores uncharted research territories.

    2. New Algorithm Design, Implementation & Validation 🧪💻
      Design → Implementation → Validation → Refinement

      • 📝Design Phase: The initial phase focuses on conceptual development, where novel algorithmic ideas are formulated and theoretical foundations are established. During this stage, we carefully plan the implementation strategy, ensuring the proposed solution advances beyond existing approaches while maintaining practical feasibility.

      • ⚙️Implementation Phase: We transform abstract concepts into concrete code implementations. This phase involves developing functional modules, establishing a robust testing environment, and creating the necessary infrastructure for experimental validation.

      • 🔬Validation Phase: Systematic experimentation forms the core of our validation process. We execute comprehensive tests to evaluate algorithm performance, collect metrics, and document all findings. This phase ensures rigorous verification of the implementation against practical requirements.

      • 🔧Refinement Phase: Based on validation results, we enter an iterative refinement cycle. This phase involves identifying areas for improvement, optimizing code efficiency, and implementing necessary enhancements. We carefully analyze performance bottlenecks and plan strategic improvements for the next development iteration.

    3. Paper Writing ✍️📝

      • Writer Agent 📄: Automatically generates full-length academic papers by integrating research ideas, motivations, newly designed algorithm frameworks, and algorithm validation performance. Leveraging a hierarchical writing approach, it creates polished manuscripts with precision and clarity.

🚀 This fully automated system removes the need for manual intervention across the entire research lifecycle, enabling effortless and seamless scientific discovery—from initial concept to final publication. 🚀 It serves as an excellent research assistant, aiding researchers in achieving their goals efficiently and effectively.


  • 🔬 Comprehensive Benchmark Suite
    We have developed a comprehensive and standardized evaluation framework to objectively assess the academic capabilities of AI researchers and the quality of their scholarly work, integrating several key innovations to ensure thorough and reliable evaluation.

    1. 👨‍🔬 Expert-Level Ground Truth: The benchmark leverages human expert-written papers as ground-truth references, establishing a high-quality standard for comparison and validation.

    2. 🌈 Multi-Domain Coverage: Our benchmark is designed to comprehensively span 4 major research domains, ensuring broad applicability: Computer Vision (CV), Natural Language Processing (NLP), Data Mining (DM), and Information Retrieval (IR).

    3. 🌐 Fully Open-Source Benchmark Construction: We have fully open-sourced the methodology and process for building the benchmark, including complete access to processed datasets, data collection pipelines, and processing code. This ensures Transparency in Evaluation while empowering the community to customize and construct benchmarks tailored to their specific domains for testing AI researchers.

    4. 📊 Comprehensive Evaluation Metrics: Our evaluation framework adopts a hierarchical and systematic approach, where tasks are organized into two levels based on the extent of idea provision. Leveraging specialized Evaluator Agents, the framework conducts thorough assessments across multiple dimensions, ensuring a robust and comprehensive evaluation. Key evaluation metrics include: 1) Novelty: Assessing the innovation and uniqueness of the research work. 2) Experimental Comprehensiveness: Evaluating the design, execution, and rigor of the experiments. 3) Theoretical Foundation: Measuring the strength of the theoretical background and foundations. 4) Result Analysis: Analyzing the depth and accuracy of result interpretation. 5) Writing Quality: Reviewing the clarity, coherence, and structure of the written report.

🚀 Advancing Research Automation. This benchmark suite provides an objective framework for assessing research automation capabilities. It is designed to evolve continuously, incorporating new advancements and expanding its scope to meet the growing demands of the research community.


  • 🌟 Easy-to-Use AI Research Assistant
    AI-Researcher delivers a truly seamless and accessible experience for research automation, empowering users to focus on innovation without technical barriers. Key features include:

    1. 🌐 Multi-LLM Provider Support: Effortlessly integrates with leading language model providers such as Claude, OpenAI, Deepseek, and more. Researchers can select the most suitable AI capabilities for their specific needs.

    2. 📚 Effortless Research Kickoff: Kickstart your research journey with unparalleled ease! Simply provide a list of relevant papers, and AI-Researcher takes care of the rest—no need to upload files, contribute initial ideas, or navigate complex configurations. It’s the ultimate tool to help you jumpstart your research process efficiently and effectively.

    3. 🧠 Minimal Domain Expertise Needed: AI-Researcher simplifies the research process by autonomously identifying critical research gaps, proposing innovative approaches, and executing the entire research pipeline. While some domain understanding can enhance results, the tool is designed to empower users of all expertise levels to achieve impactful outcomes with ease.

    4. 📦 Out-of-the-Box Functionality: Experience seamless research automation right from the start. AI-Researcher is ready to use with minimal setup, giving you instant access to advanced capabilities. Skip the hassle of complex configurations and dive straight into accelerating your research process with ease and efficiency.

🔍 How to use AI-Researcher

1. Research Agent

If you want to run the research agent with a given idea (Level 1 tasks), conducting an extensive survey and experiments, use the following command in research_agent/run_infer_level_1.sh:

current_dir=$(dirname "$(readlink -f "$0")")
cd $current_dir
export DOCKER_WORKPLACE_NAME=workplace_paper

export BASE_IMAGES=tjbtech1/paperagent:latest

export COMPLETION_MODEL=claude-3-5-sonnet-20241022
export CHEEP_MODEL=claude-3-5-haiku-20241022

category=vq
instance_id=one_layer_vq
export GPUS='"device=0,1"'

python run_infer_plan.py --instance_path ../benchmark/final/${category}/${instance_id}.json --container_name paper_eval --task_level task1 --model $COMPLETION_MODEL --workplace_name workplace --cache_path cache --port 12372 --max_iter_times 0 --category ${category}

If you only want to provide the reference papers and let the research agent generate the idea and then conduct the experiments (Level 2 tasks), use the following command in research_agent/run_infer_level_2.sh:

current_dir=$(dirname "$(readlink -f "$0")")
cd $current_dir
export DOCKER_WORKPLACE_NAME=workplace_paper

export BASE_IMAGES=tjbtech1/paperagent:latest

export COMPLETION_MODEL=claude-3-5-sonnet-20241022
export CHEEP_MODEL=claude-3-5-haiku-20241022

category=vq
instance_id=one_layer_vq
export GPUS='"device=0,1"'

python run_infer_idea.py --instance_path ../benchmark/final/${category}/${instance_id}.json --container_name paper_eval --model $COMPLETION_MODEL --workplace_name workplace --cache_path cache --port 12372 --max_iter_times 0 --category ${category}

2. Paper Writing Agent

If you want to generate the paper after the research agent has conducted the research, use the following command in paper_agent/run_infer.sh:

#!/bin/bash

cd path/to/AI-Researcher/paper_agent

export OPENAI_API_KEY=your_openai_api_key


research_field=vq
instance_id=rotated_vq

python path/to/AI-Researcher/paper_agent/writing.py --research_field ${research_field} --instance_id ${instance_id}

3. Benchmark Data and Collection

Our benchmark is also fully open-sourced:

  • Detailed benchmark data is available in the benchmark folder.
  • Detailed benchmark collection process is available in the benchmark_collection folder.

📖 Documentation

Comprehensive documentation is on its way 🚀! Stay tuned for updates on our Documentation page.

🤝 Join the Community

We aim to build a vibrant community around AI-Researcher and warmly invite everyone to join us. Here’s how you can become part of our community:

🌟 Cite