Linguini is a benchmark to measure a language model’s linguistic reasoning skills without relying on pre-existing language-specific knowledge, based on the International Linguistic Olympiad problems.

facebookresearch/linguini

Linguini 🍝: A benchmark for language-agnostic linguistic reasoning

Paper

https://arxiv.org/pdf/2409.12126

Download the dataset

The latest version of the dataset can be downloaded here. It is distributed as a zip archive protected with the password linguisticreasoning. The data is available only in this format to keep it from being picked up by crawlers and accidentally included in the web corpora often used to train LLMs and large-scale machine translation models, which would render it useless as a benchmark.
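As a minimal sketch of unpacking the archive programmatically, the snippet below uses Python's standard `zipfile` module with the password given above. The function name `extract_archive` and the file names in the usage note are hypothetical, and the sketch assumes the archive uses the legacy zip encryption that `zipfile` can decrypt.

```python
import zipfile
from pathlib import Path

# Password as stated in this README; zipfile expects it as bytes.
ARCHIVE_PASSWORD = b"linguisticreasoning"


def extract_archive(archive_path, out_dir):
    """Extract a password-protected zip and return the names of its members.

    `archive_path` and `out_dir` are supplied by the caller; this README
    does not specify a file name for the downloaded archive.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        # The password is only applied to members that are encrypted.
        zf.extractall(path=out, pwd=ARCHIVE_PASSWORD)
        return zf.namelist()
```

A call might look like `extract_archive("linguini.zip", "data/linguini")`, where both paths are placeholders. If the archive turns out to use an encryption scheme that `zipfile` does not support, the call will raise an error and an external unzip tool would be needed instead. Extract to a location that is not publicly served.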

⚠️ Please note ⚠️:

  1. Please do not re-host this data as plain text in places where it might be picked up by web crawlers.
  2. If you are planning on evaluating your model with Linguini, you should ensure its contents are not in your training data.
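This README does not prescribe how to check for contamination. One common heuristic, sketched here under that caveat, is to measure how many word n-grams from a benchmark text also appear in a sample of training text. The function names and the default `n=8` are illustrative assumptions, not part of Linguini.

```python
def ngrams(text, n=8):
    """Return the set of lowercase, whitespace-tokenized word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(benchmark_text, training_text, n=8):
    """Fraction of the benchmark's n-grams that also occur in the training text.

    Returns 0.0 when the benchmark text is shorter than n tokens.
    """
    bench = ngrams(benchmark_text, n)
    if not bench:
        return 0.0
    return len(bench & ngrams(training_text, n)) / len(bench)
```

A ratio near 1.0 for a benchmark item suggests that item appears nearly verbatim in the training sample. A low ratio does not prove absence, since paraphrased or reformatted copies would not be caught by exact n-gram matching.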

See the CONTRIBUTING file for how to help out.

License

Linguini is CC-BY-SA licensed, as found in the LICENSE file.
