Hsu Arthur L, Kondrashova Olga, Lunke Sebastian, Love Clare J, Meldrum Cliff, Marquis-Nicholson Renate, Corboy Greg, Pham Kym, Wakefield Matthew, Waring Paul M, Taylor Graham R
Department of Pathology, The University of Melbourne, Parkville, Victoria, Australia.
Hum Mutat. 2015 Apr;36(4):411-8. doi: 10.1002/humu.22763. Epub 2015 Mar 16.
Conventional means of identifying variants in high-throughput sequencing align each read against a reference sequence, and then call variants at each position. Here, we demonstrate an orthogonal means of identifying sequence variation by grouping the reads as amplicons prior to any alignment. We used AmpliVar to make key-value hashes of sequence reads and group reads as individual amplicons using a table of flanking sequences. Low-abundance reads were removed according to a selectable threshold, and reads above this threshold were aligned as groups, rather than as individual reads, permitting the use of sensitive alignment tools. We show that this approach is more sensitive, more specific, and more computationally efficient than comparable methods for the analysis of amplicon-based high-throughput sequencing data. The method can be extended to enable alignment-free confirmation of variants seen in hybridization capture target-enrichment data.
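The grouping step described in the abstract (hash the reads, assign them to amplicons by flanking sequence, then drop low-abundance reads) can be illustrated with a minimal Python sketch. This is not AmpliVar's actual code: the function name `group_reads_by_amplicon`, the `flanks` dictionary layout, the exact-match flank search, and the use of an absolute count for the threshold are all assumptions made for illustration.

```python
from collections import Counter


def group_reads_by_amplicon(reads, flanks, min_count=2):
    """Group identical reads per amplicon before any alignment.

    reads:     iterable of read sequences (strings)
    flanks:    hypothetical table, amplicon name -> (left_flank, right_flank)
    min_count: assumed absolute abundance threshold for a unique read
    """
    # Key-value hash: unique read sequence -> number of times it was seen.
    read_counts = Counter(reads)

    groups = {name: {} for name in flanks}
    for seq, count in read_counts.items():
        # Remove low-abundance reads before any further work.
        if count < min_count:
            continue
        for name, (left, right) in flanks.items():
            start = seq.find(left)
            if start == -1:
                continue
            end = seq.find(right, start + len(left))
            if end == -1:
                continue
            # Keep only the sequence between the two flanks.
            insert = seq[start + len(left):end]
            groups[name][insert] = groups[name].get(insert, 0) + count
            break  # a read is assigned to at most one amplicon
    return groups


if __name__ == "__main__":
    demo_reads = (
        ["AAACCCGTTTGGG"] * 3    # most abundant sequence
        + ["AAACCCATTTGGG"] * 2  # second sequence, differs by one base
        + ["AAACCCTTTTGGG"]      # singleton, below the threshold
        + ["CCCC"]               # matches no amplicon
    )
    demo_flanks = {"amp1": ("AAACCC", "TTTGGG")}
    print(group_reads_by_amplicon(demo_reads, demo_flanks))
    # prints {'amp1': {'G': 3, 'A': 2}}
```

In this sketch only the surviving unique sequences per amplicon would need to be aligned, each once together with its count, which is what makes a slower but more sensitive aligner affordable.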