(in progress)
This is the official repo for our paper "MuDAF: Long-Context Multi-Document Attention Focusing through Contrastive Learning on Attention Heads".
Long-context LLMs (LCLMs) are easily distracted by irrelevant context, which prevents them from fully utilizing their long-context capabilities. How can we make them more focused?
In this work, we identify retrieval heads for multi-document question answering (MDQA): attention heads that attend to the golden passages for a given question and exhibit patterns different from those found in the Needle-in-a-Haystack (NIAH) test.
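As a rough illustration of how such heads can be located, the sketch below scores a single head by the share of attention mass it places on the golden passage. The model name, the choice of the final prompt token, and the span-based scoring are assumptions for this example, not the paper's exact procedure.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; any decoder-only LM with attention outputs works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, attn_implementation="eager"  # eager attention exposes attention maps
)

def head_retrieval_score(prompt: str, gold_span: tuple[int, int], layer: int, head: int) -> float:
    """Share of the final token's attention that one head places on the golden passage."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**inputs, output_attentions=True)
    # out.attentions: one (batch, num_heads, seq_len, seq_len) tensor per layer
    attn = out.attentions[layer][0, head, -1]   # attention row of the last prompt token
    start, end = gold_span                      # token indices of the golden passage
    return (attn[start:end].sum() / attn.sum()).item()
```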
Our method explicitly enhances the retrieval capabilities of specific attention heads, making models more focused in MDQA. Our experiments show that contrastive learning has great potential for adjusting the attention distributions of selected attention heads.
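To make the idea concrete, here is a hedged sketch of an InfoNCE-style contrastive objective that pushes a chosen head to place more attention mass on the golden passage than on irrelevant ones. The pooling choice, the temperature `tau`, and the function name are illustrative assumptions; the loss used in MuDAF may be formulated differently.

```python
import torch
import torch.nn.functional as F

def attention_contrastive_loss(
    head_attn: torch.Tensor,           # (seq_len,) attention of one head from the query/answer position
    gold_span: tuple[int, int],        # token range of the golden passage (positive)
    neg_spans: list[tuple[int, int]],  # token ranges of irrelevant passages (negatives)
    tau: float = 0.1,                  # temperature (assumed hyperparameter)
) -> torch.Tensor:
    """InfoNCE over pooled attention mass: pull mass toward the golden passage."""
    pos = head_attn[gold_span[0]:gold_span[1]].sum()
    negs = torch.stack([head_attn[s:e].sum() for s, e in neg_spans])
    logits = torch.cat([pos.unsqueeze(0), negs]) / tau
    # the positive sits at index 0; cross-entropy pushes its score above the negatives
    target = torch.zeros(1, dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits.unsqueeze(0), target)
```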