Alioto Tyler S, Buchhalter Ivo, Derdak Sophia, Hutter Barbara, Eldridge Matthew D, Hovig Eivind, Heisler Lawrence E, Beck Timothy A, Simpson Jared T, Tonon Laurie, Sertier Anne-Sophie, Patch Ann-Marie, Jäger Natalie, Ginsbach Philip, Drews Ruben, Paramasivam Nagarajan, Kabbe Rolf, Chotewutmontri Sasithorn, Diessl Nicolle, Previti Christopher, Schmidt Sabine, Brors Benedikt, Feuerbach Lars, Heinold Michael, Gröbner Susanne, Korshunov Andrey, Tarpey Patrick S, Butler Adam P, Hinton Jonathan, Jones David, Menzies Andrew, Raine Keiran, Shepherd Rebecca, Stebbings Lucy, Teague Jon W, Ribeca Paolo, Giner Francesc Castro, Beltran Sergi, Raineri Emanuele, Dabad Marc, Heath Simon C, Gut Marta, Denroche Robert E, Harding Nicholas J, Yamaguchi Takafumi N, Fujimoto Akihiro, Nakagawa Hidewaki, Quesada Víctor, Valdés-Mas Rafael, Nakken Sigve, Vodák Daniel, Bower Lawrence, Lynch Andrew G, Anderson Charlotte L, Waddell Nicola, Pearson John V, Grimmond Sean M, Peto Myron, Spellman Paul, He Minghui, Kandoth Cyriac, Lee Semin, Zhang John, Létourneau Louis, Ma Singer, Seth Sahil, Torrents David, Xi Liu, Wheeler David A, López-Otín Carlos, Campo Elías, Campbell Peter J, Boutros Paul C, Puente Xose S, Gerhard Daniela S, Pfister Stefan M, McPherson John D, Hudson Thomas J, Schlesner Matthias, Lichter Peter, Eils Roland, Jones David T W, Gut Ivo G
CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain.
Universitat Pompeu Fabra (UPF), 08002 Barcelona, Spain.
Nat Commun. 2015 Dec 9;6:10001. doi: 10.1038/ncomms10001.
As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here, using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× is beneficial, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and the lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy, and that remedying them has an immediate positive impact on mutation detection accuracy.
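The two quantities the abstract refers to, accuracy against a benchmark mutation set and concordance among pipelines, can be illustrated with a minimal Python sketch. It is not the method used in the study. It assumes each mutation call is reduced to a (chromosome, position, reference allele, alternate allele) tuple, scores one pipeline by precision, recall and F1 against a truth set, and measures pairwise concordance as the Jaccard index. The function names and the toy call sets are hypothetical.

```python
from itertools import combinations

# Assumption: a somatic mutation call is keyed by
# (chromosome, 1-based position, reference allele, alternate allele).

def evaluate(calls, truth):
    """Return (precision, recall, F1) of a call set against a benchmark set."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1

def jaccard(a, b):
    """Jaccard index of two call sets: shared calls divided by all calls."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def pairwise_concordance(call_sets):
    """Map each unordered pair of pipeline names to its Jaccard index."""
    return {
        (x, y): jaccard(call_sets[x], call_sets[y])
        for x, y in combinations(sorted(call_sets), 2)
    }

if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    truth = {("1", 100, "A", "T"), ("1", 200, "C", "G"),
             ("2", 50, "G", "A"), ("3", 10, "T", "C")}
    call_sets = {
        "pipeline_a": {("1", 100, "A", "T"), ("1", 200, "C", "G"),
                       ("4", 5, "A", "C")},
        "pipeline_b": {("1", 100, "A", "T"), ("2", 50, "G", "A")},
    }
    for name, calls in sorted(call_sets.items()):
        precision, recall, f1 = evaluate(calls, truth)
        print(f"{name}: precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
    for pair, score in pairwise_concordance(call_sets).items():
        print(f"{pair[0]} vs {pair[1]}: Jaccard={score:.2f}")
```

Exact tuple matching is a simplification: real comparisons typically need variant normalisation (for example, left-alignment of indels) before calls from different pipelines can be matched.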