VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering.

Authors

Verbist Bie M P, Thys Kim, Reumers Joke, Wetzels Yves, Van der Borght Koen, Talloen Willem, Aerssens Jeroen, Clement Lieven, Thas Olivier

Affiliations

Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent; Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse; Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium; and University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia.

Publication information

Bioinformatics. 2015 Jan 1;31(1):94-101. doi: 10.1093/bioinformatics/btu587. Epub 2014 Aug 31.

Abstract

MOTIVATION

In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from real low-frequency mutations.
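To make the role of quality scores concrete, the following minimal Python sketch converts Phred-scaled Qs into base-calling error probabilities using the standard relation P = 10^(-Q/10). It is an illustration only and is not taken from VirVarSeq. The function names are hypothetical, and the default ASCII offset of 33 assumes the Phred+33 encoding used in current Illumina fastq files.

```python
def decode_fastq_quality(qual: str, offset: int = 33) -> list[int]:
    """Decode a fastq quality string into integer Phred scores.

    Assumption: Phred+33 encoding (offset=33); older Illumina
    pipelines used an offset of 64.
    """
    return [ord(char) - offset for char in qual]


def phred_to_error_prob(q: int) -> float:
    """Return the base-calling error probability implied by Phred score q."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # Print each decoded score with the error probability it implies.
    for q in decode_fastq_quality("I5!"):
        print(q, phred_to_error_prob(q))
```

Under this relation a base with Q = 20 has a 1% chance of being wrong, which is of the same order as the low-frequency variants of interest. This is why unfiltered base calls can obscure real mutations at such frequencies.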

RESULTS

A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is embedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup reduces the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information between single-nucleotide polymorphisms is retained because variants are called at the codon level. This gives virologists an immediate biological interpretation of the reported variants with respect to antiviral drug response. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup yields outstanding sensitivity while maintaining good specificity for variants with frequencies down to 0.5%.
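The idea of quality-filtered, codon-level counting can be sketched in Python as follows. This is not the Q-cpileup implementation. The function name, the read representation (gapless alignments given as start position, sequence and Phred scores) and the rule of discarding a codon when any of its three bases falls below the threshold are all assumptions made for illustration. The threshold `min_q` is passed in as a fixed value; the adaptive per-sample threshold selection described in the abstract is not reproduced here.

```python
from collections import Counter


def codon_frequencies(reads, codon_start, min_q):
    """Estimate codon frequencies at one reference codon.

    reads: iterable of (ref_start, sequence, qualities) tuples, where
        ref_start is the 0-based reference position of the first base
        and the alignment is assumed to be gapless (hypothetical format).
    codon_start: 0-based reference position of the codon's first base.
    min_q: Phred threshold; a read's codon is discarded when any of its
        three bases scores below it (an assumed filtering rule).
    """
    counts = Counter()
    for ref_start, seq, quals in reads:
        i = codon_start - ref_start
        # Skip reads that do not cover the full codon.
        if i < 0 or i + 3 > len(seq):
            continue
        # Skip codons containing a low-quality base call.
        if min(quals[i:i + 3]) < min_q:
            continue
        # Count the three bases together so linkage within the codon is kept.
        counts[seq[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


if __name__ == "__main__":
    reads = [
        (0, "ATGAAA", [30] * 6),
        (0, "ATGAAG", [30, 30, 30, 30, 30, 5]),  # low-quality third base
        (3, "AAG", [30, 30, 30]),
        (0, "ATGAAA", [30] * 6),
    ]
    print(codon_frequencies(reads, codon_start=3, min_q=20))
```

Counting whole codons rather than single positions means that two substitutions on the same read are reported as one codon variant, which maps directly to an amino acid change.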

AVAILABILITY

VirVarSeq is available, together with a user's guide and test data, on SourceForge: http://sourceforge.net/projects/virtools/?source=directory.
