options

Susana Posada-Cespedes edited this page Jun 14, 2017 · 9 revisions

V-pipe: user configurable options

The workflow can be customized through the configuration file vpipe.config. This is a plain-text file with a basic structure composed of sections, properties, and values. For instance, as input V-pipe expects a tabular file specifying unique sample identifiers (e.g., patient identifiers) and sample dates for the different sequencing runs related to the same patient. The name of this file (here, samples.tsv) is provided by setting the samples_file property in the input section, as follows:

[input]
samples_file = samples.tsv

As shown above, section names are enclosed in square brackets, and each property is followed by an equals sign and its value.
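The section/property/value layout resembles the INI format, so it can be read with Python's standard configparser module. The following is a minimal sketch under that assumption; it illustrates the file structure and is not necessarily how V-pipe itself parses vpipe.config.

```python
import configparser

# Configuration text mirroring the example above.
CONFIG_TEXT = """
[input]
samples_file = samples.tsv
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Each section behaves like a mapping from property name to a string value.
samples_file = parser["input"]["samples_file"]
print(samples_file)
```

To read an actual file instead of an inline string, `parser.read("vpipe.config")` can be used in place of `read_string`.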

Below, we list all user-configurable options, grouped by section.

input

datadir

Directory where samples are stored. By default, it is set to samples.

samples_file

File containing unique sample identifiers and dates as tab-separated values, e.g.,

patient1    20100113
patient1    20110202
patient2    20081130

Here, we have two samples from patient 1 and one sample from patient 2. By default, V-pipe searches for a file named samples.tsv. If this file does not exist, a list of samples is built by globbing the contents of the datadir directory.
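The lookup order described above (sample file first, directory globbing as a fallback) can be sketched as follows. This is an illustration only: the function name list_samples is hypothetical, and the fallback assumes the datadir/identifier/date directory layout shown elsewhere on this page rather than reproducing V-pipe's actual code.

```python
import csv
import glob
import os


def list_samples(samples_file="samples.tsv", datadir="samples"):
    """Return (identifier, date) pairs for every sample."""
    if os.path.isfile(samples_file):
        # Preferred source: one tab-separated "identifier<TAB>date" pair per line.
        with open(samples_file, newline="") as handle:
            return [(row[0], row[1])
                    for row in csv.reader(handle, delimiter="\t")
                    if len(row) >= 2]
    # Fallback (assumed layout): datadir/<identifier>/<date>/
    samples = []
    for path in sorted(glob.glob(os.path.join(datadir, "*", "*"))):
        if os.path.isdir(path):
            identifier, date = os.path.relpath(path, datadir).split(os.sep)
            samples.append((identifier, date))
    return samples
```

Calling `list_samples()` with no arguments uses the default names mentioned on this page (samples.tsv and samples).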

fastq_suffix

Fastq files are expected to be stored in a subdirectory named raw_data. For example, for patient 1 and the first sample, the hierarchy should look like

samples
└── patient1
    └── 20100113
        └── raw_data
            ├── patient1_20100113_R1.fastq
            └── patient1_20100113_R2.fastq

By default, V-pipe finds the fastq files matching the following pattern: prefix + R + {1,2} + .fastq. If a suffix needs to be inserted after R1 and R2, the user must specify it through this option.
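As a rough illustration of the naming pattern, the sketch below builds the two expected file names for a sample. The helper fastq_names is hypothetical, the prefix is assumed to be the identifier and date joined by underscores (as in the example hierarchy), and the suffix value "_001" in the usage lines is only an example.

```python
def fastq_names(identifier, date, fastq_suffix=""):
    """Build the expected paired-end file names: prefix + R + {1,2} + suffix + .fastq."""
    # Assumed prefix, taken from the example hierarchy: <identifier>_<date>_
    prefix = f"{identifier}_{date}_"
    return [f"{prefix}R{mate}{fastq_suffix}.fastq" for mate in (1, 2)]


default_names = fastq_names("patient1", "20100113")
suffixed_names = fastq_names("patient1", "20100113", fastq_suffix="_001")
```

With an empty suffix, the names match the files shown in the raw_data directory above; with a suffix, it is placed between R1/R2 and the .fastq extension.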

Resource requirements can vary with input size. Users can specify memory and time requirements for all rules. For multi-threaded software packages, the number of threads can also be customized.
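One way to picture per-rule resource settings is as optional integer properties that fall back to a default when omitted. The sketch below assumes an INI-style reader (Python's configparser); all numeric values, including the fallbacks, are placeholders chosen for illustration and are not V-pipe's defaults, and this page does not state the units of mem and time.

```python
import configparser

# Hypothetical overrides for a single rule; the values are placeholders.
CONFIG_TEXT = """
[bwa_align]
mem = 2000
threads = 4
"""

# Placeholder fallbacks, NOT the defaults shipped with V-pipe.
FALLBACKS = {"mem": 1000, "time": 60, "threads": 1}


def rule_resources(parser, rule):
    """Return mem/time/threads for a rule, using fallbacks for unset properties."""
    return {key: parser.getint(rule, key, fallback=default)
            for key, default in FALLBACKS.items()}


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
bwa_resources = rule_resources(parser, "bwa_align")
gunzip_resources = rule_resources(parser, "gunzip")
```

Here bwa_align picks up its two overrides and the fallback for time, while gunzip, which has no section in the example text, receives only fallbacks.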

gunzip

mem

time

extract

mem

time

preprocessing

mem

time

qual_threshold

Mean quality score used for filtering low-quality reads.

min_len

Reads shorter than min_len are filtered out.
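The effect of qual_threshold and min_len on a single read can be sketched as below. This is not the filtering tool V-pipe runs; it assumes Phred+33 quality encoding and assumes that a read whose mean quality equals the threshold is kept. The function names are hypothetical.

```python
def mean_quality(quality_string, offset=33):
    """Mean Phred score of a read, assuming Phred+33 ASCII encoding."""
    return sum(ord(char) - offset for char in quality_string) / len(quality_string)


def keep_read(sequence, quality_string, qual_threshold, min_len):
    """Return True if the read passes both the length and the quality filter."""
    # Empty reads and reads shorter than min_len are filtered out.
    if not sequence or len(sequence) < min_len:
        return False
    # Assumption: reads with mean quality equal to the threshold are kept.
    return mean_quality(quality_string) >= qual_threshold
```

For example, `keep_read("ACGT", "IIII", qual_threshold=30, min_len=4)` keeps the read, since each "I" encodes a Phred score of 40, whereas the same read with min_len=5 would be discarded.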

initial_vicuna

mem

time

threads

initial_vicuna_msa

mem

time

threads

hmm_align

mem

time

threads

sam2bam

mem

time

bwa_align

mem

time

threads

coverage_QA

mem

time

msa

mem

time

threads

convert_to_hxb2

mem

time

Defaults for all user-configurable options are provided in vpipe.snake.
