Telomere-to-telomere consortium primates project

T2T-Primates is a project of the Telomere-to-Telomere consortium, led by the Makova, Phillippy, and Eichler labs. The project aims to produce complete, diploid assemblies for key non-human primate species and is currently focused on gorilla, bonobo, chimpanzee, orangutan, and gibbon. Following the approach of the human T2T-CHM13 project, all species have been sequenced with high-coverage PacBio HiFi (~60x) and Oxford Nanopore ultra-long (~40x) reads. For haplotype phasing, Dovetail Hi-C data was generated for all genomes, and Strand-seq data is also expected. Parental Illumina data was collected for bonobo and gorilla, for which familial trios were available.

Phase one of the project is focused on completing the sex chromosomes; phase two will focus on finishing the autosomes of bonobo and gorilla; and phase three will focus on the remaining genomes. The project is currently in phase one, with draft T2T sex chromosome assemblies now available for all genomes.

Latest assembly releases

Version 1 diploid assemblies were generated with Verkko v1.1, and contigs were chromosome-assigned and oriented by alignment to the previous references. Both X and Y chromosomes are complete for all species listed. Gorilla and bonobo were phased using familial trios, and all others using Hi-C:

Gorilla gorilla (gorilla)
Pan paniscus (bonobo)
Pan troglodytes (chimpanzee)
Pongo abelii (Sumatran orangutan)
Pongo pygmaeus (Bornean orangutan)
Symphalangus syndactylus (siamang gibbon)

Downloads

All generated sequencing data and assemblies are available for browsing and download from GenomeArk.

Notes on downloading files

Files are generously hosted by Amazon Web Services under s3://genomeark. Although the files are available through the HTTP links above, downloads are faster with the Amazon Web Services command-line interface (AWS CLI). To use it, rewrite each link to the s3:// addressing scheme. Tuning max_concurrent_requests and related settings as described in this guide will improve download performance further.
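As a rough illustration of the link rewriting described above, the following Python sketch converts an HTTP link into an s3:// URI and builds the matching AWS CLI commands without running them. It is not part of the project's tooling. The function names are hypothetical, the two HTTP link layouts it accepts are assumptions about how the bucket is exposed, the --no-sign-request flag assumes the bucket allows anonymous reads, and the value 20 for max_concurrent_requests is only an example.

```python
from urllib.parse import urlparse

BUCKET = "genomeark"  # bucket name taken from s3://genomeark


def http_to_s3(url: str, bucket: str = BUCKET) -> str:
    """Rewrite an HTTP(S) link into the bucket as an s3:// URI.

    Assumes one of two common S3 link layouts (an assumption, not
    something the README states):
      https://<bucket>.s3.amazonaws.com/<key>   (virtual-hosted style)
      https://s3.amazonaws.com/<bucket>/<key>   (path style)
    """
    parsed = urlparse(url)
    host = parsed.netloc
    path = parsed.path.lstrip("/")
    if host.startswith(bucket + ".s3"):
        key = path
    elif host.startswith("s3") and path.startswith(bucket + "/"):
        key = path[len(bucket) + 1:]
    else:
        raise ValueError(f"not a link into the {bucket} bucket: {url}")
    return f"s3://{bucket}/{key}"


def download_command(url: str, dest: str = ".") -> list[str]:
    """Build (but do not run) an `aws s3 cp` command for one file.

    --no-sign-request skips credential signing; this assumes the
    bucket permits anonymous access.
    """
    return ["aws", "s3", "cp", "--no-sign-request", http_to_s3(url), dest]


def tuning_command(max_requests: int = 20) -> list[str]:
    """Build the command that raises the CLI's S3 request concurrency.

    The value 20 is illustrative only; choose one that suits your
    network and machine.
    """
    return ["aws", "configure", "set",
            "default.s3.max_concurrent_requests", str(max_requests)]


if __name__ == "__main__":
    # Hypothetical object key, used only to show the rewriting.
    link = "https://genomeark.s3.amazonaws.com/species/example/assembly.fa.gz"
    print(" ".join(tuning_command()))
    print(" ".join(download_command(link)))
```

The commands are returned as argument lists so they can be inspected first and then passed to subprocess.run once the AWS CLI is installed.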

Data reuse and license

All data is released to the public domain (CC0) and we encourage its reuse. However, we are in the process of finishing and analyzing these genomes, so to avoid duplicating effort, we encourage you to contact us if you are interested in contributing. The following working groups have been formed:

Assembly
Annotation
Sex chromosome genomics
Comparative and evolutionary genomics
Segmental duplications
Acrocentric chromosomes
Satellite DNAs
Mobile elements
Pangenomics

Contact

For any problems related to this dataset, please raise issues on this GitHub repository. For general questions regarding the project, please contact adam.phillippy@nih.gov. More information about our consortium can be found on the T2T homepage.

History

* Dec 2022. Initial release.
