Division of Molecular Pathology, Department of Pathology, Hong Kong Sanatorium & Hospital, Happy Valley, Hong Kong SAR, China.
Department of Surgery, The University of Hong Kong, Happy Valley, Hong Kong SAR, China.
Sci Rep. 2017 May 8;7(1):1567. doi: 10.1038/s41598-017-01703-6.
Amplicon-based next-generation sequencing (NGS) has been widely adopted for detecting genetic variation in humans and other organisms. The conventional data analysis paradigm trims primers before read mapping. Here we introduce BAMClipper, which maps the original sequencing reads first and then removes primer sequences by soft-clipping the SAM/BAM alignments. The choice of primer handling approach affected mutation detection accuracy in real NGS datasets from 7 human peripheral blood or breast cancer tissue samples with known BRCA1/BRCA2 mutations and in more than 130,000 simulated NGS datasets with unique mutations. The BAMClipper approach detected a BRCA1 deletion (c.1620_1636del) that was otherwise missed due to an edge effect. Simulation showed a high false-negative rate when primers were perfectly trimmed, as in conventional practice. Among the other 6 samples, the variant allele frequencies of 5 BRCA1/BRCA2 mutations (indels or single-nucleotide variants) were diluted by apparently wild-type primer sequences from an overlapping amplicon (17 to 82% under-estimation). BAMClipper was robust in both situations, and all 7 mutations were detected. Compared with Cutadapt, BAMClipper was faster and maintained equally high primer removal effectiveness. BAMClipper is implemented in Perl and is available under an open source MIT license at https://github.com/tommyau/bamclipper.
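To illustrate what soft-clipping a primer after mapping means at the alignment level, the following is a minimal Python sketch, not BAMClipper's actual Perl implementation. It assumes the number of reference bases covered by the primer at the left end of an alignment is already known; the function names `parse_cigar` and `soft_clip_left` are hypothetical and exist only for this example. The sketch rewrites a SAM CIGAR string so that the primer portion becomes soft-clipped (`S`) and shifts the alignment start accordingly. Clipping the right end would be the mirror image and is omitted.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '5S45M' into [(5, 'S'), (45, 'M')]."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def soft_clip_left(pos, cigar, n_ref):
    """Soft-clip the first n_ref reference bases of an alignment.

    pos   : alignment start on the reference (any coordinate convention)
    cigar : original CIGAR string
    n_ref : reference bases to clip (assumed known, e.g. primer overlap)
    Returns (new_pos, new_cigar).
    """
    ops = parse_cigar(cigar)
    hard = ""
    if ops and ops[0][1] == "H":        # keep a leading hard clip as-is
        hard = f"{ops[0][0]}H"
        ops = ops[1:]

    clipped_query = 0                   # read bases that end up soft-clipped
    remaining = n_ref                   # reference bases still to clip
    new_pos = pos
    i = 0
    while i < len(ops):
        length, op = ops[i]
        if op == "H":                   # trailing hard clip: stop here
            break
        if op in "M=X":                 # consumes read and reference
            if remaining == 0:
                break                   # first aligned base after the primer
            take = min(length, remaining)
            clipped_query += take
            new_pos += take
            remaining -= take
            if take < length:
                ops[i] = (length - take, op)
                break
        elif op in "DN":                # consumes reference only
            # A deletion touching the clipped region is dropped entirely,
            # so the alignment never starts with a deletion.
            new_pos += length
            remaining -= min(length, remaining)
        elif op in "IS":                # consumes read only
            clipped_query += length
        i += 1                          # padding (P) is simply skipped

    parts = [hard]
    if clipped_query:
        parts.append(f"{clipped_query}S")
    parts.extend(f"{n}{op}" for n, op in ops[i:])
    return new_pos, "".join(parts)


if __name__ == "__main__":
    # A 50-base read aligned at position 100 with a 20-base primer.
    print(soft_clip_left(100, "50M", 20))   # (120, '20S30M')
```

Because the read sequence itself is left untouched and only the alignment record changes, the mapper still sees the full-length read, which is the property the abstract credits for avoiding the edge effect.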