We need to generate a small, controlled in silico test dataset for the viral metagenomics pipeline. This will be used for validating functionality and benchmarking performance.
The dataset should include:
- A metadata.csv file listing the organisms (viruses, bacteria, fungi, and host genomes such as human or other mammals)
- Simulated FASTQ files for both Illumina and Nanopore sequencing platforms
- For each sample, the expected percentage of reads from each organism (to later test detection and quantification accuracy)
The goal is to create synthetic samples that closely resemble real-world metagenomic data in complexity and composition.
Deliverables:
- metadata.csv with organism names and assigned abundance percentages per sample
- Paired-end FASTQ files for Illumina
- Single-end FASTQ files (or appropriate format) for Nanopore
- Clear documentation on how the data was generated (tools, parameters, etc.)
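To make the metadata.csv deliverable concrete, here is a minimal Python sketch of one possible layout and a sanity check on it. The column names (`sample_id`, `organism`, `category`, `expected_percent`), the example organisms, and the percentages are all assumptions for illustration, not a decided format. The sketch checks that each sample's percentages sum to 100 and converts them into per-organism read counts for a chosen sequencing depth.

```python
import csv
import io

# Hypothetical long-format layout: one row per (sample, organism) pair.
# Column names, organisms, and values are illustrative assumptions only.
EXAMPLE_METADATA = """\
sample_id,organism,category,expected_percent
sample_01,Homo sapiens,host,90.0
sample_01,Escherichia coli,bacteria,7.0
sample_01,Candida albicans,fungi,2.0
sample_01,Human adenovirus C,virus,1.0
"""


def load_abundances(handle):
    """Return {sample_id: {organism: expected_percent}} from a CSV handle."""
    samples = {}
    for row in csv.DictReader(handle):
        per_sample = samples.setdefault(row["sample_id"], {})
        per_sample[row["organism"]] = float(row["expected_percent"])
    return samples


def validate(samples, tolerance=1e-6):
    """Raise ValueError if any sample's percentages do not sum to 100."""
    for sample_id, abundances in samples.items():
        total = sum(abundances.values())
        if abs(total - 100.0) > tolerance:
            raise ValueError(f"{sample_id}: percentages sum to {total}, expected 100")


def reads_per_organism(abundances, total_reads):
    """Split total_reads by percentage; counts always sum to total_reads."""
    raw = {org: pct * total_reads / 100.0 for org, pct in abundances.items()}
    counts = {org: int(value) for org, value in raw.items()}
    leftover = total_reads - sum(counts.values())
    # Hand out the reads lost to truncation, largest fractional part first.
    by_remainder = sorted(raw, key=lambda org: raw[org] - counts[org], reverse=True)
    for org in by_remainder[:leftover]:
        counts[org] += 1
    return counts


samples = load_abundances(io.StringIO(EXAMPLE_METADATA))
validate(samples)
counts = reads_per_organism(samples["sample_01"], total_reads=1000)
```

A long format (one row per sample and organism) keeps the file easy to extend with more samples, and the same table can later serve as ground truth when comparing the pipeline's reported abundances.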
Useful tools for in silico dataset generation:
Here are some open-source tools that can help with generating synthetic metagenomic data:
- [CAMISIM](https://github.com/CAMI-challenge/CAMISIM)
- [InSilicoSeq](https://github.com/HadrienG/InSilicoSeq)
- [NanoSim](https://github.com/bcgsc/NanoSim)
- [ART](https://www.niehs.nih.gov/research/resources/software/biostatistics/art/index.cfm)
- [Grinder](https://github.com/zlinsly/grinder)
- [NeatSeq-Flow’s simulator module](https://neatseq-flow.readthedocs.io/en/latest/Modules.html#simulatefastq)
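Some of these simulators take per-genome relative abundances as an input file. As far as I know, InSilicoSeq accepts a tab-separated file of genome identifiers and fractions summing to 1, but the exact format should be checked against each tool's documentation before relying on it. The Python sketch below converts one sample's expected percentages into such a file. The function name and the genome identifiers are hypothetical placeholders; in practice the identifiers would need to match the record IDs in the reference FASTA.

```python
import io


def write_abundance_tsv(percent_by_genome, handle):
    """Write 'genome_id<TAB>fraction' lines, normalising so fractions sum to 1.

    The two-column tab-separated layout is an assumption about what the
    simulator expects; verify it in the tool's documentation.
    """
    total = sum(percent_by_genome.values())
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    for genome_id, percent in percent_by_genome.items():
        handle.write(f"{genome_id}\t{percent / total:.6f}\n")


# Placeholder identifiers, not real accessions.
sample_01 = {"host_genome": 90.0, "bacterium_genome": 7.0,
             "fungus_genome": 2.0, "virus_genome": 1.0}

buffer = io.StringIO()
write_abundance_tsv(sample_01, buffer)
abundance_text = buffer.getvalue()
```

Writing to a file handle rather than a fixed path keeps the helper easy to test and lets the same function produce one abundance file per sample.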
Next steps: