School of Molecular and Biomedical Science and School of Mathematical Sciences, University of Adelaide, South Australia, Australia.
Bioinformatics. 2013 Sep 15;29(18):2223-30. doi: 10.1093/bioinformatics/btt375. Epub 2013 Jul 9.
MOTIVATION: With the advent of relatively affordable high-throughput technologies, DNA sequencing of cancers is now common practice in cancer research projects and will be increasingly used in clinical practice to inform diagnosis and treatment. Somatic (cancer-only) single nucleotide variants (SNVs) are the simplest class of mutation, yet their identification in DNA sequencing data is confounded by germline polymorphisms, tumour heterogeneity and sequencing and analysis errors. Four recently published algorithms for the detection of somatic SNV sites in matched cancer-normal sequencing datasets are VarScan, SomaticSniper, JointSNVMix and Strelka. In this analysis, we apply these four SNV calling algorithms to cancer-normal Illumina exome sequencing of a chronic myeloid leukaemia (CML) patient. The candidate SNV sites returned by each algorithm are filtered to remove likely false positives, then characterized and compared to investigate the strengths and weaknesses of each SNV calling algorithm.

RESULTS: Comparing the candidate SNV sets returned by VarScan, SomaticSniper, JointSNVMix2 and Strelka revealed substantial differences with respect to the number and character of sites returned; the somatic probability scores assigned to the same sites; their susceptibility to various sources of noise; and their sensitivities to low-allelic-fraction candidates.

AVAILABILITY: Data accession number SRA081939, code at http://code.google.com/p/snv-caller-review/

CONTACT: david.adelson@adelaide.edu.au

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
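The comparison of candidate SNV sets described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the site coordinates are toy values, the helper names `support_counts` and `pairwise_jaccard` are hypothetical, and representing each caller's output as a set of (chromosome, position) pairs is an assumption made only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical candidate sites (chromosome, position) per caller.
# Toy values for illustration only; they are not taken from the study.
calls = {
    "VarScan": {("chr1", 1000), ("chr2", 2000), ("chr3", 3000)},
    "SomaticSniper": {("chr1", 1000), ("chr2", 2000), ("chr4", 4000)},
    "JointSNVMix2": {("chr1", 1000), ("chr5", 5000)},
    "Strelka": {("chr1", 1000), ("chr2", 2000)},
}


def support_counts(calls):
    """Map each site to the number of callers that reported it."""
    return Counter(site for sites in calls.values() for site in sites)


def pairwise_jaccard(calls):
    """Jaccard similarity of the site sets for every pair of callers."""
    result = {}
    for a, b in combinations(sorted(calls), 2):
        union = calls[a] | calls[b]
        result[(a, b)] = len(calls[a] & calls[b]) / len(union) if union else 0.0
    return result


support = support_counts(calls)

# Sites reported by every caller.
consensus = {site for site, n in support.items() if n == len(calls)}

# Sites reported by exactly one caller, grouped by that caller.
unique = {
    name: {site for site in sites if support[site] == 1}
    for name, sites in calls.items()
}
```

Under these assumptions, `consensus` holds the sites all four callers agree on, `unique` shows what each caller alone reports, and `pairwise_jaccard(calls)` gives a rough measure of overlap between any two callers.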