Releases: opencb/hpg-variant
First stable release
First stable release! Comments and feedback are welcome!
Few changes since the last release candidate, most of them aimed at improving usability:
- Support for multigenerational families in PED files
- Merge tool notifies when files are unsorted
- All tools notify when output files can't be created
- Sample statistics added to database generated by hpg-var-vcf stats
- Logging output redirection: DEBUG/INFO to stdout, WARN/ERROR/FATAL to stderr
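The stream split in the last item can be illustrated with a minimal Python sketch. This is not the tool's own code; `MaxLevelFilter` and `build_logger` are hypothetical names, and Python's CRITICAL level stands in for FATAL.

```python
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Let through only records at or below a given level."""

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno <= self.max_level


def build_logger(name="demo", out=None, err=None):
    """Route DEBUG/INFO to `out` and WARNING and above to `err`."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False  # avoid duplicate output via the root logger

    low = logging.StreamHandler(out)
    low.setLevel(logging.DEBUG)
    low.addFilter(MaxLevelFilter(logging.INFO))  # cap this handler at INFO

    high = logging.StreamHandler(err)
    high.setLevel(logging.WARNING)

    for handler in (low, high):
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

With this split, redirecting stdout to a file keeps warnings and errors visible on the terminal.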
Great performance improvements and full support for VCF v4.1
A large reduction in the memory usage of the VCF merging tool makes it possible to merge hundreds of VCF files at twice the previous speed
Full support for Variant Call Format v4.1, including structural variants and novel adjacencies with breakends
Command-line autocompletion in Bash shell
SNP ID shown in GWAS output report
New statistics per group and usability improved
VCF stats: Statistics can be calculated using any column in the PED file as the grouping criterion, so samples can be grouped not only by case-control status but also by sex, population, and so on
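As a rough illustration of grouping by a PED column, the Python sketch below assigns each individual to a group keyed by the value in the chosen column. It assumes the conventional six leading PED columns (family, individual, father, mother, sex, phenotype) separated by whitespace; `group_samples` and `PED_COLUMNS` are hypothetical names, not part of hpg-variant.

```python
from collections import defaultdict

# Conventional leading PED columns; extra columns can be addressed
# by their zero-based index instead of a name.
PED_COLUMNS = {
    "family": 0,
    "individual": 1,
    "father": 2,
    "mother": 3,
    "sex": 4,
    "phenotype": 5,
}


def group_samples(ped_lines, column):
    """Map each distinct value of `column` to the individuals carrying it."""
    index = PED_COLUMNS[column] if column in PED_COLUMNS else int(column)
    groups = defaultdict(list)
    for line in ped_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        groups[fields[index]].append(fields[1])
    return dict(groups)
```

Per-group statistics can then be computed over the sample lists returned for each key.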
VCF merge:
- No longer blocks when only one file is pending
- Removes duplicates from the FILTER column
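Removing duplicates from a FILTER value can be sketched as follows. This is an illustrative Python version that assumes the VCF convention of semicolon-separated filter names; `dedupe_filter` is a hypothetical helper, and preserving first-seen order is a choice made here, not something the notes specify.

```python
def dedupe_filter(filter_field):
    """Drop repeated names from a VCF FILTER value, keeping first-seen order."""
    if filter_field in ("", ".", "PASS"):
        return filter_field  # nothing to deduplicate
    names = [name for name in filter_field.split(";") if name]
    # dict preserves insertion order, so this keeps the first occurrence
    return ";".join(dict.fromkeys(names))
```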
New command-line options: --version, --log-level
Parallelism configuration made easier by removing the "entries-per-thread" option (now calculated automatically)
Better handling of multi-allelic variants
VCF merging tool more tolerant of different reference alleles (can be configured via the --strict-ref argument)
hpg-var-effect checks all alternate alleles of a single variant
Memory leaks in hpg-var-effect fixed
VCF tools vastly improved, much faster VCF merge
VCF filter
- New filters by gene, region+type, indel status, and inheritance pattern (dominant/recessive)
- Single-core performance improved by up to 2x
- Multi-threaded implementation
- GFF/BED as input for region filtering
VCF statistics
- New statistics: Mendelian errors per sample; indel status and inheritance pattern per variant
- Saving result statistics to SQLite DB file
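To make the Mendelian error statistic concrete, here is a minimal Python sketch of a trio check. It assumes diploid, autosomal genotypes in VCF GT notation and treats trios with any missing allele as not evaluable; `parse_gt` and `is_mendelian_error` are hypothetical names and the tool's own rules may differ.

```python
import re


def parse_gt(sample_field):
    """Return the allele indices of a GT value, or None if any is missing."""
    alleles = re.split(r"[/|]", sample_field.split(":")[0])
    if "." in alleles:
        return None
    return [int(a) for a in alleles]


def is_mendelian_error(father, mother, child):
    """True if the child's genotype cannot be built from one allele per parent."""
    f, m, c = parse_gt(father), parse_gt(mother), parse_gt(child)
    if f is None or m is None or c is None or len(c) != 2:
        return False  # not evaluable under this sketch's assumptions
    a, b = c
    consistent = (a in f and b in m) or (b in f and a in m)
    return not consistent
```

Counting `True` results per child across all variants gives a per-sample error tally.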
VCF merge: Large reduction in memory usage, performance improved by up to 3x
VCF split: By coverage intervals
Bug fixed: VCF files with arbitrary header length are now accepted
Miscellaneous: Some library dependencies packaged inside the application
Usability improvements
Filter output uses default names (your_vcf_file.vcf.filtered and your_vcf_file.vcf.rejected); the --out argument is reserved for tool output
Merge tool notifies when a sample appears more than once
GWAS analysis notifies when a sample appears more than once
GWAS analysis improved (again!)
GWAS analysis: Properly handles individuals with no ancestors or sex specified
GWAS analysis improved
GWAS analysis: Properly processes PED files with no family structure, and VCF files with fewer samples than the PED file
Filter of VCF files by percentage of missing values
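A filter on the percentage of missing values could look roughly like the Python sketch below. It assumes a tab-separated VCF data line with sample columns starting at the tenth field and GT as the first sub-field, and it counts a sample as missing when any of its alleles is "."; the function names and the default threshold are illustrative only.

```python
import re


def missing_fraction(sample_fields):
    """Fraction of samples whose genotype contains a missing allele."""
    if not sample_fields:
        return 0.0
    missing = sum(
        1
        for field in sample_fields
        if "." in re.split(r"[/|]", field.split(":")[0])
    )
    return missing / len(sample_fields)


def passes_missing_filter(vcf_line, max_missing=0.1):
    """Keep a record when its missing fraction does not exceed `max_missing`."""
    fields = vcf_line.rstrip("\n").split("\t")
    return missing_fraction(fields[9:]) <= max_missing
```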
Effect tool tries to reconnect when an error occurs
Effect tool tries to reconnect 3 times when an error occurs. VCF records not yet processed are written to a file.
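The retry behaviour can be sketched in Python as follows. This is an illustration under assumptions, not the effect tool's implementation: `process_with_retry`, the `send` callback, and the use of `ConnectionError` are all hypothetical, and "3 times" is read here as three retries after the initial attempt.

```python
def process_with_retry(records, send, unprocessed_path, max_retries=3):
    """Send each record, retrying on connection errors.

    If a record still fails after `max_retries` retries, it and all
    later records are written to `unprocessed_path` and processing stops.
    """
    results = []
    for i, record in enumerate(records):
        retries = 0
        while True:
            try:
                results.append(send(record))
                break
            except ConnectionError:
                retries += 1
                if retries > max_retries:
                    # Give up: save everything not yet processed.
                    with open(unprocessed_path, "w") as handle:
                        handle.writelines(r + "\n" for r in records[i:])
                    return results
    return results
```

Saving the remaining records lets a later run resume from the saved file instead of starting over.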
New VCF tools included
Split into multiple binaries: hpg-var-effect, hpg-var-gwas, hpg-var-vcf
Sorting results of GWAS tests
Tools for preprocessing Variant Call Format (VCF) files:
- Merge multiple files
- Filter by minor allele frequency (MAF)
- Configure whether VCF records rejected by a filter are written to a file
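For the MAF filter listed above, the quantity being thresholded can be computed as in this Python sketch. It reads MAF as minor allele frequency and takes it to be the frequency of the second most common allele among called genotypes, which is one common convention; `minor_allele_frequency` is a hypothetical helper, and which side of the threshold the tool keeps is not stated in these notes.

```python
import re
from collections import Counter


def minor_allele_frequency(sample_fields):
    """Frequency of the second most common allele among called alleles."""
    counts = Counter()
    for field in sample_fields:
        for allele in re.split(r"[/|]", field.split(":")[0]):
            if allele != ".":  # ignore missing calls
                counts[allele] += 1
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no calls, or only one allele observed
    return counts.most_common()[1][1] / total
```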
SCons used as build system instead of Make