This project investigates model editing techniques for Large Language Models (LLMs), focusing on how these models can be modified in specific domains without compromising overall performance.
- Evaluate model editing techniques with multiple metrics across different model architectures
- Determine the model-independence of editing techniques
Editing quality is assessed along four metrics:
- Generalization: How well the model answers queries related to edited facts
- Locality: Ensuring unedited queries remain unaffected
- Accuracy: Correctness of answers for edited facts
- Portability: Ability to handle multi-hop questions
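A minimal sketch of how these metrics are commonly scored is shown below, assuming a hypothetical evaluation record with edit, paraphrase, locality, and multi-hop prompts paired with expected answers; the field names and the `generate` callable are illustrative, not taken from any specific benchmark.

```python
# Illustrative scoring of the four editing metrics (field names are hypothetical).

def exact_match_rate(pairs, generate):
    """Fraction of (prompt, expected_answer) pairs the model answers exactly."""
    if not pairs:
        return 0.0
    hits = sum(generate(prompt).strip() == expected.strip() for prompt, expected in pairs)
    return hits / len(pairs)

def evaluate_edit(record, generate):
    """Score a single edited fact; `generate` maps a prompt to the model's answer."""
    return {
        # Accuracy: the edited prompt itself must return the new answer
        "accuracy": exact_match_rate([(record["edit_prompt"], record["new_answer"])], generate),
        # Generalization: paraphrases of the edited prompt should also change
        "generalization": exact_match_rate(
            [(p, record["new_answer"]) for p in record["paraphrases"]], generate),
        # Locality: unrelated prompts must keep their original answers
        "locality": exact_match_rate(record["unrelated_prompts"], generate),
        # Portability: multi-hop questions that depend on the edited fact
        "portability": exact_match_rate(record["multi_hop_prompts"], generate),
    }
```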
Large Language Models (LLMs) store knowledge in their parametric memory through complex neural network weights. However, updating this knowledge is challenging due to:
- High computational cost of full model retraining
- Risk of catastrophic forgetting
- Difficulty of maintaining model performance across different domains
Model editing aims to modify specific factual knowledge in LLMs without:
- Complete model retraining
- Significant performance degradation
- Disrupting existing knowledge representations
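Concretely, an edit is usually specified as a small request pairing a prompt with a new target answer; the record below is a purely illustrative example (the field names are not taken from any particular toolkit).

```python
# A hypothetical edit request: rewrite one stored fact without retraining the model.
edit_request = {
    "prompt": "The Eiffel Tower is located in",  # query whose answer should change
    "subject": "Eiffel Tower",                   # entity whose fact is rewritten
    "target_new": "Rome",                        # counterfactual answer after editing
}
# After editing, paraphrases ("Where is the Eiffel Tower?") should also return the
# new answer (generalization), while unrelated prompts keep their answers (locality).
```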
Preservation-Based Methods
- Memory-Based Approaches
  - Store edit examples separately
  - Guide model output for specific inputs
  - Minimize direct parameter modification
  - Examples: SERAC, MemPrompt, IKE (see the sketch below)
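The core idea of this family can be illustrated with a small wrapper that keeps edits in an external memory and intercepts queries that fall within an edit's scope. This is a simplified sketch in the spirit of SERAC, not its actual implementation; the overlap-based scope check and the `base_model` callable are assumptions.

```python
class MemoryBasedEditor:
    """Keep edits in an external memory; route matching queries to the stored answer."""

    def __init__(self, base_model, scope_threshold=0.8):
        self.base_model = base_model          # callable: prompt -> answer (frozen LLM)
        self.memory = []                      # list of (edit_prompt, new_answer) pairs
        self.scope_threshold = scope_threshold

    def add_edit(self, edit_prompt, new_answer):
        self.memory.append((edit_prompt, new_answer))

    def _in_scope(self, query, edit_prompt):
        # Toy scope check based on token overlap. Real systems such as SERAC train
        # a classifier (or use embeddings) to decide whether an edit applies.
        q, e = set(query.lower().split()), set(edit_prompt.lower().split())
        return len(q & e) / max(len(e), 1) >= self.scope_threshold

    def __call__(self, query):
        for edit_prompt, new_answer in self.memory:
            if self._in_scope(query, edit_prompt):
                return new_answer             # answer comes from memory, model untouched
        return self.base_model(query)         # fall back to the unedited model
```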
- Additional Parameter Techniques
  - Introduce small, trainable parameter sets
  - Focused control over specific edits
  - Minimize impact on the core model
  - Examples: T-Patcher, CaliNET, GRACE (see the sketch below)
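A toy version of the additional-parameter idea, loosely following T-Patcher: the original feed-forward layer is frozen and a few trainable "patch" neurons are added whose output is summed on top of it. The dimensions, gating details, and training loop here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PatchedFFN(nn.Module):
    """Frozen feed-forward block plus a small trainable patch used for edits."""

    def __init__(self, ffn: nn.Module, hidden_dim: int, num_patch_neurons: int = 1):
        super().__init__()
        self.ffn = ffn
        for p in self.ffn.parameters():          # original weights are never updated
            p.requires_grad = False
        # Only these extra key/value parameters are trained on the edit example.
        self.patch_key = nn.Linear(hidden_dim, num_patch_neurons)
        self.patch_value = nn.Linear(num_patch_neurons, hidden_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = self.ffn(x)                       # unchanged behaviour of the core model
        gate = torch.relu(self.patch_key(x))     # should stay near zero off the edit
        return base + self.patch_value(gate)     # edit correction added on top
```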
Direct Parameter Modification Methods
- Locate-Then-Edit Approaches
  - Identify specific parameters related to the target knowledge
  - Directly modify the identified parameters
  - Precise but potentially invasive
  - Examples: ROME, MEMIT, PMET (see the rank-one update sketch below)
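Locate-then-edit methods such as ROME view an MLP projection as a linear key-value memory and insert a new fact with a closed-form rank-one update, so that a located key vector maps to a new value vector. The sketch below shows the simplest unconstrained form of that update; ROME itself adds a covariance-based term that preserves other associations, which is omitted here.

```python
import numpy as np

def rank_one_edit(W: np.ndarray, k: np.ndarray, v_new: np.ndarray) -> np.ndarray:
    """Return W' such that W' @ k == v_new (simplified; real ROME constrains the
    update so that other key-value associations are preserved)."""
    residual = v_new - W @ k                  # gap between current output and target
    return W + np.outer(residual, k) / (k @ k)

# Tiny demonstration with random shapes.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))                   # "value <- key" projection matrix
k = rng.normal(size=3)                        # key vector for the edited subject
v_new = rng.normal(size=4)                    # desired value encoding the new fact
W_edited = rank_one_edit(W, k, v_new)
assert np.allclose(W_edited @ k, v_new)       # the edited fact is now stored
```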
- Meta-Learning Techniques
  - Use a hypernetwork as a "teacher"
  - Learn optimal parameter update strategies
  - Minimize negative side effects
  - Examples: MEND, KE (see the hypernetwork sketch below)
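Meta-learning editors such as MEND train a lightweight hypernetwork that transforms the raw fine-tuning gradient of an edit into a better-behaved parameter update. The sketch below is heavily simplified: MEND actually operates on low-rank factors of the gradient, and the architecture and `apply_edit` helper here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GradientEditor(nn.Module):
    """Hypernetwork: map a raw edit gradient to a refined parameter update."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, grad_flat: torch.Tensor) -> torch.Tensor:
        return self.net(grad_flat)              # learned transformation of the gradient

def apply_edit(weight: torch.Tensor, raw_grad: torch.Tensor,
               editor: GradientEditor, lr: float = 1e-2) -> torch.Tensor:
    """One edit step: the hypernetwork's output replaces the raw gradient.
    The editor itself is meta-trained to maximize edit success while penalizing
    drift on unrelated inputs (meta-training loop not shown)."""
    update = editor(raw_grad.flatten()).view_as(weight)
    return weight - lr * update
```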
Benchmarks used for evaluation:
- ZsRE
- COUNTERFACT
- MQuAKE (for multi-hop prompts)
Editing techniques evaluated:
- Memory-Based Approaches (e.g., SERAC, MemPrompt)
- Additional Parameter Techniques (e.g., T-Patcher, CaliNET)
- Locate-Then-Edit Approaches (e.g., ROME, MEMIT)
- Meta-learning Techniques (e.g., MEND)
| Technique | Generalization | Locality | Accuracy |
|---|---|---|---|
| ROME | 96.5 | 75.41 | 100 |
| MEND | 58.3 | 44.88 | 94.2 |

| Technique | Generalization | Locality | Accuracy |
|---|---|---|---|
| ROME | 92.45 | 87.04 | 99.63 |
| MEND | 94.24 | 90.27 | 97.04 |
- Models with more parameters respond better to editing techniques
- Multi-hop prompting remains a challenging area for most LLMs
- ROME showed superior generalization compared to MEND
- Model editing offers a computationally efficient alternative to full retraining
- Potential to improve LLM performance in specific domains