Dipartimento di Bioscienze, Università degli Studi di Milano, Milan, Italy.
SCAI, Cineca, Consorzio Interuniversitario di Supercalcolo, Rome, Italy.
BMC Genomics. 2018 Feb 5;19(1):120. doi: 10.1186/s12864-018-4508-1.
The advent and ongoing development of next-generation sequencing (NGS) technologies have led to a rapid increase in the rate at which human genome resequencing data are produced, paving the way for personalized genomics and precision medicine. The body of genome resequencing data continues to grow, underlining the need for accurate and time-effective bioinformatics systems for genotyping, a crucial prerequisite for the identification of candidate causal mutations in diagnostic screens.
Here we present CoVaCS, a fully automated, highly accurate system with a web-based graphical interface for genotyping and variant annotation. Extensive tests on a gold-standard benchmark dataset (the NA12878 Illumina platinum genome) confirm that call-sets based on our consensus strategy are completely in line with those attained by similar command-line-based approaches, and far more accurate than call-sets from any individual tool. Importantly, our system exhibits better sensitivity and higher specificity than equivalent commercial software.
CoVaCS offers optimized pipelines that integrate state-of-the-art tools for variant calling and annotation of whole-genome sequencing (WGS), whole-exome sequencing (WES) and target-gene sequencing (TGS) data. The system is currently hosted at Cineca and offers the speed of an HPC facility, a crucial consideration when large numbers of samples must be analysed. Importantly, all analyses are performed automatically, which makes the results highly reproducible. As such, we believe that CoVaCS can be a valuable tool for the analysis of human genome resequencing studies. CoVaCS is available at https://bioinformatics.cineca.it/covacs.