A repository for sbatch scripts used for viral pipeline development

mihinduk/HTCF_viral

HTCF_viral

A repository for sbatch scripts used for viral pipeline development

Current contig assembly and annotation example (post-Hecatomb):

1. Assemble contigs:
sbatch virome_assembly_kallisto_htcf_mod2.sbatch
NOTE: Please change this line to your user:
#SBATCH --mail-user=mihindu

2. Contig annotation:
sbatch contig_annotation_htcf_mod_v2.sbatch
NOTE: Please change this line to your user:
#SBATCH --mail-user=mihindu
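
The mail-user change in steps 1 and 2 can also be scripted instead of edited by hand. The following is a minimal sketch, not part of this repository: the function name `set_mail_user` is hypothetical, and it assumes the sbatch script contains exactly the `#SBATCH --mail-user=` line shown above.

```shell
# Hypothetical helper (not shipped with this repository): rewrite the
# "#SBATCH --mail-user=" line of an sbatch script in place.
set_mail_user() {
    local script="$1"
    local user="${2:-$USER}"   # default to the current login name
    local tmp
    tmp="$(mktemp)" || return 1
    # Replace whatever follows --mail-user= on the #SBATCH line.
    # Writing through a temp file avoids relying on GNU-only "sed -i".
    sed "s|^#SBATCH --mail-user=.*|#SBATCH --mail-user=${user}|" "$script" > "$tmp" \
        && cat "$tmp" > "$script"
    local status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, `set_mail_user contig_annotation_htcf_mod_v2.sbatch your_username` would update the annotation script before it is submitted with sbatch.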

3. Run cenote-taker2 on Pathogen: DNA
mkdir -p /path/to/your/project/assembly/contig_dictionary
cd /path/to/your/project/assembly/contig_dictionary
# Copy (e.g. with rsync) your contig_dictionary.fasta file into this directory
cd /path/to/your/project/
mkdir

# Set variables - now 999 to match CAT
MIN=999

nohup python /mnt/pathogen1/rrodgers/Cenote-Taker2/run_cenote-taker2.0.1.py \
--contigs /mnt/pathogen1/kathiem/Jeffrey_IBD_VLP/assembly/contig_dictionary/contig_dictionary.fasta \
--run_title Jeffrey_IBD_VLP_DNA_ct2_all \
--template_file ../template.sbt \
--mem 80 --cpu 20 \
--prune_prophage FALSE \
--filter_out_plasmids FALSE \
--minimum_length_circular $MIN \
--minimum_length_linear $MIN \
--hhsuite_tool hhsearch \
--handle_contigs_without_hallmark sketch_all > DNA_out.log 2>&1 &

outfile: /mnt/pathogen1/kathiem/Jeffrey_IBD_VLP/Jeffrey_IBD_VLP_DNA_ct2_all/Jeffrey_IBD_VLP_DNA_ct2_all.tsv
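
Before launching a long run, it can be useful to see how many contigs the MIN cutoff would keep. The sketch below is an illustration only: `count_contigs_min_len` is a hypothetical helper, and it assumes the length threshold is inclusive (contigs of exactly MIN bases are kept), which may not match how Cenote-Taker2 applies it.

```shell
# Hypothetical helper: count FASTA records whose sequence length is at
# least MIN bases. Handles sequences wrapped over several lines.
# Assumption: the cutoff is inclusive (length >= MIN).
count_contigs_min_len() {
    local fasta="$1"
    local min="$2"
    awk -v min="$min" '
        /^>/ {                       # header line: close the previous record
            if (seen && len >= min) n++
            len = 0; seen = 1
            next
        }
        { len += length($0) }        # sequence line: accumulate its length
        END {
            if (seen && len >= min) n++
            print n + 0              # print 0 rather than an empty string
        }
    ' "$fasta"
}
```

For example, `count_contigs_min_len contig_dictionary.fasta "$MIN"` prints the number of contigs that meet the cutoff.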

4. Run cenote-taker2 RNA:
mkdir Jeffrey_IBD_VLP_RNA_ct2_all

4A. Modify infile to replace spaces in headers with @:
sed 's/ /@/g' -i non_viral_domains_contigs.fna
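
The sed command above edits the file in place and applies to every line. As a hedged alternative, the sketch below restricts the substitution to header lines (those starting with `>`) and writes to standard output so the result can be inspected first. The function name `sanitize_fasta_headers` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: replace spaces with @ on FASTA header lines only,
# leaving sequence lines untouched. Prints the result to stdout and does
# not modify the input file.
sanitize_fasta_headers() {
    sed '/^>/ s/ /@/g' "$1"
}
```

For example, `sanitize_fasta_headers non_viral_domains_contigs.fna | grep '^>' | head` previews the rewritten headers.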

4B. Find RNA viruses:
conda activate /mnt/pathogen1/rrodgers/miniconda2/envs/cenote-taker2_env

MIN=999

nohup python /mnt/pathogen1/rrodgers/Cenote-Taker2/run_cenote-taker2.0.1.py \
--contigs /mnt/pathogen1/kathiem/Jeffrey_IBD_VLP/Jeffrey_IBD_VLP_DNA_ct2_all/other_contigs/non_viral_domains_contigs.fna \
--run_title Jeffrey_IBD_VLP_RNA_ct2_all \
--template_file ../template.sbt \
--mem 80 --cpu 20 \
--virus_domain_db rna_virus \
--prune_prophage FALSE \
--filter_out_plasmids FALSE \
--minimum_length_circular $MIN \
--minimum_length_linear $MIN \
--hhsuite_tool hhsearch \
--handle_contigs_without_hallmark sketch_all > RNA_out.log 2>&1 &

outfile: Jeffrey_IBD_VLP_RNA_ct2_all.tsv

5. Perl scripts to happily marry CAT and CT2 results:
5A. Standardize Cenote-Taker2 output:
perl cenote-taker2_parser_v3.pl <Cenote-Taker2 output (.tsv)> <Cenote-Taker2 mode: DNA (default) or RNA>
perl cenote-taker2_parser_v3.pl Jeffrey_IBD_VLP_RNA_ct2_all.tsv RNA
Your outfile is Jeffrey_IBD_VLP_RNA_ct2_all_clean_tax.txt

perl cenote-taker2_parser_v3.pl Jeffrey_IBD_VLP_DNA_ct2_all.tsv DNA
Your outfile is Jeffrey_IBD_VLP_DNA_ct2_all_clean_tax.txt

5B. Standardize CAT output:
perl cat_taxonomy_supplement.pl <output from CAT (contig.taxonomy)> <flye assembler outfile (assembly_info.txt)> <CAT.scores from CAT>
perl cat_taxonomy_supplement.pl contig.taxonomy assembly_info.txt CAT.scores
Your parsed contig taxonomy file is: contig.taxonomy_clean_tax.txt

5C. Merge CAT and CT2 output:
perl cenote-taker2_cat_taxonomy_merger.pl <CT2 DNA clean_tax file> <CT2 RNA clean_tax file> <CAT clean_tax file> <output prefix>
perl cenote-taker2_cat_taxonomy_merger.pl Jeffrey_IBD_VLP_DNA_ct2_all_clean_tax.txt Jeffrey_IBD_VLP_RNA_ct2_all_clean_tax.txt contig.taxonomy_clean_tax.txt Jeffrey_IBD_VLP_CAT_ct2
Your R taxonomy infile is Jeffrey_IBD_VLP_CAT_ct2_CT2_CAT_contig_taxonomy.txt
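
Because the merger in 5C depends on three files produced in 5A and 5B, a quick existence check can catch a failed earlier step. This is a minimal sketch: `require_files` is a hypothetical helper, not a script in this repository.

```shell
# Hypothetical helper: succeed only if every named file exists and is
# non-empty; report each missing or empty file on stderr.
require_files() {
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One way to use it is `require_files Jeffrey_IBD_VLP_DNA_ct2_all_clean_tax.txt Jeffrey_IBD_VLP_RNA_ct2_all_clean_tax.txt contig.taxonomy_clean_tax.txt && perl cenote-taker2_cat_taxonomy_merger.pl ...`, so the merger only runs when all three inputs are present.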
