A while back, this method got added to VRS-Python to help with HGVS translation:
    def extract_sequence_type(alias: str) -> str | None:
        """Provide a convenient way to extract the sequence type from an accession by matching its prefix to a known set of prefixes.

        Args:
            alias (str): The accession string.

        Returns:
            str or None: The sequence type associated with the accession string, or None if no matching prefix is found.
        """
        prefix_dict = {
            "refseq:NM_": "c",
            "refseq:NC_012920": "m",
            "refseq:NG_": "g",
            "refseq:NC_00": "g",
            "refseq:NW_": "g",
            "refseq:NT_": "g",
            "refseq:NR_": "n",
            "refseq:NP_": "p",
            "refseq:XM_": "c",
            "refseq:XR_": "n",
            "refseq:XP_": "p",
            "GRCh": "g",
        }
        for prefix, seq_type in prefix_dict.items():
            if alias.startswith(prefix):
                return seq_type
        return None
I don't really know the context, or whether something in bioutils already fulfills this need, but it struck me as a bioutils-esque task, so I figured I'd float the idea of moving it here.
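To make the behavior concrete, here is a minimal standalone sketch of the same prefix lookup with a few example calls. The module-level name `_PREFIX_TO_SEQ_TYPE` and the sample accession strings are my own illustrative assumptions, not anything that exists in VRS-Python or bioutils today.

```python
from __future__ import annotations

# Hypothetical module-level table; same prefixes as the VRS-Python function above.
_PREFIX_TO_SEQ_TYPE = {
    "refseq:NM_": "c",
    "refseq:NC_012920": "m",
    "refseq:NG_": "g",
    "refseq:NC_00": "g",
    "refseq:NW_": "g",
    "refseq:NT_": "g",
    "refseq:NR_": "n",
    "refseq:NP_": "p",
    "refseq:XM_": "c",
    "refseq:XR_": "n",
    "refseq:XP_": "p",
    "GRCh": "g",
}


def extract_sequence_type(alias: str) -> str | None:
    """Return the HGVS sequence type for a namespaced accession, or None."""
    for prefix, seq_type in _PREFIX_TO_SEQ_TYPE.items():
        if alias.startswith(prefix):
            return seq_type
    return None


if __name__ == "__main__":
    # Illustrative inputs only; the last one has no matching prefix.
    for alias in (
        "refseq:NM_000551.3",
        "refseq:NC_000007.14",
        "refseq:NC_012920.1",
        "refseq:NP_000542.1",
        "ga4gh:SQ.example",
    ):
        print(alias, extract_sequence_type(alias))
```

Note that the lookup expects the namespace prefix (`refseq:`), so a bare accession such as `NM_000551.3` would return `None`. Whether a bioutils version should also accept bare accessions is an open question.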