Chiang Derek Y, Getz Gad, Jaffe David B, O'Kelly Michael J T, Zhao Xiaojun, Carter Scott L, Russ Carsten, Nusbaum Chad, Meyerson Matthew, Lander Eric S
Broad Institute, Massachusetts Institute of Technology, 7 Cambridge Center, Cambridge, MA 02142, USA.
Nat Methods. 2009 Jan;6(1):99-103. doi: 10.1038/nmeth.1276. Epub 2008 Nov 30.
Cancer results from somatic alterations in key genes, including point mutations, copy-number alterations and structural rearrangements. A powerful way to discover cancer-causing genes is to identify genomic regions that show recurrent copy-number alterations (gains and losses) in tumor genomes. Recent advances in sequencing technologies suggest that massively parallel sequencing may provide a feasible alternative to DNA microarrays for detecting copy-number alterations. Here we present: (i) a statistical analysis of the power to detect copy-number alterations of a given size; (ii) SegSeq, an algorithm that segments massively parallel sequence data into regions of equal copy number; and (iii) an analysis of experimental data from three matched pairs of tumor and normal cell lines. We show that a collection of approximately 14 million aligned sequence reads from human cell lines has power to detect events comparable to that of the current generation of DNA microarrays, and has over twofold better precision for localizing breakpoints (typically to within approximately 1 kilobase).
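The idea of reading copy number from sequencing depth can be illustrated with a minimal sketch. This is not SegSeq, whose details the abstract does not give; it only bins aligned read start positions into fixed-size windows and reports a tumor/normal ratio per window, scaled by the total read counts of the two samples. The function names, the fixed-window scheme, and the 0-based coordinates are assumptions made for illustration.

```python
import math


def window_counts(positions, window_size, chrom_length):
    """Count aligned read starts (0-based) in consecutive fixed-size windows."""
    n_windows = math.ceil(chrom_length / window_size)
    counts = [0] * n_windows
    for pos in positions:
        counts[pos // window_size] += 1
    return counts


def copy_ratios(tumor_positions, normal_positions, window_size, chrom_length):
    """Tumor/normal read-count ratio per window (hypothetical helper).

    The ratio is scaled by total read counts so that two samples sequenced
    to different depths are comparable. Windows without normal reads yield
    None, because the ratio is undefined there.
    """
    tumor = window_counts(tumor_positions, window_size, chrom_length)
    normal = window_counts(normal_positions, window_size, chrom_length)
    scale = len(normal_positions) / len(tumor_positions)
    return [scale * t / n if n > 0 else None for t, n in zip(tumor, normal)]


def expected_reads_per_window(n_reads, window_size, genome_size):
    """Expected reads per window if reads fall uniformly across the genome."""
    return n_reads * window_size / genome_size


# Toy data: the middle window carries twice as many tumor reads.
normal_reads = [10, 20, 110, 120, 210, 220]
tumor_reads = [10, 20, 110, 120, 130, 140, 210, 220]
ratios = copy_ratios(tumor_reads, normal_reads, window_size=100, chrom_length=300)
print(ratios)
```

In the toy data the middle window has a ratio twice that of its neighbors, which is the kind of step a segmentation algorithm would have to localize. The helper `expected_reads_per_window` shows why read depth limits resolution: the fewer reads expected in a window, the noisier its ratio. Under the uniformity assumption, and additionally assuming a genome of roughly 3 billion bases, 14 million reads would give only a few reads per kilobase on average.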