logsdon-lab/CenMAP

CenMAP

A centromere mapping and annotation pipeline for T2T human genome assemblies implemented in Snakemake.


Chr1 α-satellite higher-order repeat structure, centromere dip regions, and self-identity plot

Chr12 α-satellite HOR arrays

Cumulative α-satellite HOR array lengths

Input
  • Verkko or hifiasm human genome assemblies.
  • PacBio HiFi reads used in the assemblies.
  • CHM13 reference genome assembly.
  • (Optional) Unaligned BAM files with 5mC modifications at CpG sites.

Output
  • Complete and correctly assembled centromere sequences and their regions, validated by NucFlag.
  • Centromere α-satellite higher-order repeat (HOR) array lengths via censtats.
  • RepeatMasker and HumAS-SD α-satellite HOR monomer annotations and plots.
  • ModDotPlot sequence identity plots.
  • Combined sequence identity and HOR array structure plots via cenplot.
  • (Optional) Centromere dip regions (CDRs) via CDR-Finder.

Read the docs on the CenMAP wiki.

To run tests, refer to the wiki page.
