Official code repository for the paper "L²M: Mutual Information Scaling Law for Long-Context Language Modeling".
This repository contains code for reproducing the experiments and results from our paper, which establishes a bipartite mutual information scaling law in natural language that governs long-range dependencies. We formulate the Long-context Language Modeling (L²M) condition, which relates a model's capacity for effective long-context modeling to how its latent state size for storing past information must scale.
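In schematic form (the notation here is ours, condensed from the paper's statements rather than quoted from them): split a long sequence into two adjacent halves X and Y of length L each; the paper's measurements support a power-law growth of their mutual information, and the L²M condition ties a model's latent state to that growth:

```latex
% Schematic form of the two central statements; notation ours, see the
% paper for precise definitions and the measured exponents.
\begin{align*}
  I(X;Y) &\propto L^{\beta}
    &&\text{bipartite MI between adjacent length-$L$ halves grows as a power law;}\\
  \dim(z_L) &\gtrsim I(X;Y)
    &&\text{L$^2$M: the latent state $z_L$ summarizing the past must scale at least as fast.}
\end{align*}
```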
Figure 1: Illustration of the central ideas of our work.
Figure 2: Illustration and estimates of the scalings of both bipartite and two-point mutual information.
The repository is organized as follows:
- `measure_mutual_info/`: Code for estimating the bipartite mutual information using LLMs, as well as the two-point mutual information (a sketch of the bipartite estimator follows this list)
- `train_on_pg19/`: Code for the experiments on the PG19 dataset (a data-loading sketch follows as well)
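As a rough illustration of the kind of estimator in `measure_mutual_info/`, the sketch below approximates the bipartite mutual information I(X; Y) = H(Y) − H(Y | X) between the two halves of each text by scoring both halves with an autoregressive LLM. It is not the repository's actual script: the `gpt2` checkpoint, the helper functions, and the BOS handling for the marginal term are all illustrative assumptions.

```python
# A minimal sketch (not the repository's script) of estimating the bipartite
# mutual information I(X; Y) = H(Y) - H(Y | X) with an autoregressive LLM.
# Both entropy terms are approximated by the model's negative log-likelihoods
# averaged over texts; the model choice and helper names are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper uses larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def total_log_prob(ids: torch.Tensor, start: int) -> float:
    """Sum of log p(token_t | tokens_{<t}) over positions t >= start (start >= 1)."""
    logits = model(ids.unsqueeze(0)).logits[0]          # (T, vocab)
    log_probs = torch.log_softmax(logits[:-1], dim=-1)  # predicts tokens 1..T-1
    targets = ids[1:].unsqueeze(-1)
    token_lp = log_probs.gather(-1, targets).squeeze(-1)
    return token_lp[start - 1:].sum().item()            # token t is scored at index t-1

def bipartite_mi_estimate(texts: list[str], half_len: int) -> float:
    """Monte-Carlo estimate of I(X; Y) ~= E[log p(Y | X) - log p(Y)], in nats."""
    bos = torch.tensor([tokenizer.bos_token_id])
    vals = []
    for text in texts:
        ids = tokenizer(text, return_tensors="pt").input_ids[0][: 2 * half_len]
        if ids.numel() < 2 * half_len:
            continue  # skip texts shorter than the full window
        y = ids[half_len:]
        log_p_y_given_x = total_log_prob(ids, start=half_len)   # log p(Y | X)
        log_p_y = total_log_prob(torch.cat([bos, y]), start=1)  # log p(Y)
        vals.append(log_p_y_given_x - log_p_y)
    return sum(vals) / len(vals)
```

Averaging log p(Y | X) − log p(Y) over texts gives a plug-in estimate whose quality depends on how well the LLM approximates the true text distribution.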
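For `train_on_pg19/`, the experiments require the PG19 long-book corpus. A minimal way to pull it, assuming the Hugging Face hub copy at `deepmind/pg19` (the repository's scripts may obtain the data differently):

```python
# A minimal sketch of loading PG19; the hub id "deepmind/pg19" and the
# streaming setup are assumptions, not necessarily what train_on_pg19/ uses.
from datasets import load_dataset

# Stream to avoid downloading the full corpus (~11 GB of text) up front.
pg19 = load_dataset("deepmind/pg19", split="train", streaming=True)

for book in pg19.take(1):
    print(book["short_book_title"])
    print(book["text"][:200])  # first 200 characters of the first book
```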
If you find our work useful, please cite:

```bibtex
@misc{chen2025l2mmutualinformationscaling,
  title={L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling},
  author={Zhuo Chen and Oriol Mayné i Comas and Zhuotao Jin and Di Luo and Marin Soljačić},
  year={2025},
  eprint={2503.04725},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2503.04725},
}
```