Institute for Clinical and Translational Research, Biomedical Research Center, Slovak Academy of Sciences, Bratislava, Slovakia; Department of Molecular Biology, Faculty of Natural Sciences, Comenius University, Bratislava, Slovakia.
Department of Molecular Biology, Faculty of Natural Sciences, Comenius University, Bratislava, Slovakia; Institute of Molecular Biomedicine, Faculty of Medicine, Comenius University, Bratislava, Slovakia.
J Biotechnol. 2019 Jun 10;298:64-75. doi: 10.1016/j.jbiotec.2019.04.013. Epub 2019 Apr 15.
Although massively parallel sequencing (MPS) is becoming common practice in both research and routine clinical care, whether identified DNA variants must be confirmed using alternative methods is still debated. When variants are evaluated directly from MPS data, read depth statistics and specialized genotype quality scores are therefore highly relevant. Here we report the results of a validation study performed in two different ways: 1) confirmation of MPS-identified variants using Sanger sequencing; and 2) simultaneous Sanger and MPS analysis of exons of selected genes. Detailed examination of false-positive and false-negative findings revealed typical error sources connected to low read depth/coverage, an incomplete reference genome, indel realignment problems, and microsatellite-associated amplification errors leading to base miscalling. However, all of these error types could be identified through thorough manual review of the aligned reads, based on specific patterns in the distribution of variants and their corresponding reads. Moreover, our results indicate that both basic quantitative metrics (such as total read counts, alternative allele read counts and allelic balance) and specific genotype quality scores depend on the bioinformatics pipeline used, which underscores the need to establish specific thresholds for these metrics in each laboratory and for each pipeline independently.
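As an illustration of the quantitative metrics named in the abstract, the following minimal Python sketch computes allelic balance (alternative allele reads divided by total reads) and flags a call for orthogonal confirmation when any metric falls below a threshold. The names `VariantCall` and `needs_confirmation`, and all threshold values, are hypothetical placeholders rather than values from the study; in line with the abstract's conclusion, they would need to be established separately for each laboratory and pipeline.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Per-variant metrics as they might be reported by a pipeline (hypothetical structure)."""
    total_reads: int         # total read depth at the variant position
    alt_reads: int           # reads supporting the alternative allele
    genotype_quality: float  # pipeline-specific genotype quality score


def allelic_balance(call: VariantCall) -> float:
    """Fraction of reads supporting the alternative allele (0.0 if there is no coverage)."""
    if call.total_reads == 0:
        return 0.0
    return call.alt_reads / call.total_reads


def needs_confirmation(
    call: VariantCall,
    min_depth: int = 20,         # placeholder threshold, not from the study
    min_alt_reads: int = 5,      # placeholder threshold, not from the study
    min_balance: float = 0.25,   # placeholder threshold, not from the study
    min_gq: float = 30.0,        # placeholder threshold, not from the study
) -> bool:
    """Return True if any metric is below its threshold, i.e. the call should be re-checked."""
    return (
        call.total_reads < min_depth
        or call.alt_reads < min_alt_reads
        or allelic_balance(call) < min_balance
        or call.genotype_quality < min_gq
    )


if __name__ == "__main__":
    well_supported = VariantCall(total_reads=100, alt_reads=48, genotype_quality=99.0)
    poorly_supported = VariantCall(total_reads=12, alt_reads=3, genotype_quality=20.0)
    print(allelic_balance(well_supported), needs_confirmation(well_supported))
    print(allelic_balance(poorly_supported), needs_confirmation(poorly_supported))
```

Keeping the thresholds as function parameters rather than constants reflects the point that they are pipeline-dependent: the same function could be called with a different set of values for each pipeline a laboratory validates.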