The Genome Institute, Washington University, St. Louis, MO 63108, USA.
Genome Res. 2012 Mar;22(3):568-76. doi: 10.1101/gr.129684.111. Epub 2012 Feb 2.
Cancer is a disease driven by genetic variation and mutation. Exome sequencing can be utilized for discovering these variants and mutations across hundreds of tumors. Here we present an analysis tool, VarScan 2, for the detection of somatic mutations and copy number alterations (CNAs) in exome data from tumor-normal pairs. Unlike most current approaches, our algorithm reads data from both samples simultaneously; a heuristic and statistical algorithm detects sequence variants and classifies them by somatic status (germline, somatic, or LOH); while a comparison of normalized read depth delineates relative copy number changes. We apply these methods to the analysis of exome sequence data from 151 high-grade ovarian tumors characterized as part of the Cancer Genome Atlas (TCGA). We validated some 7790 somatic coding mutations, achieving 93% sensitivity and 85% precision for single nucleotide variant (SNV) detection. Exome-based CNA analysis identified 29 large-scale alterations and 619 focal events per tumor on average. As in our previous analysis of these data, we observed frequent amplification of oncogenes (e.g., CCNE1, MYC) and deletion of tumor suppressors (NF1, PTEN, and CDKN2A). We searched for additional recurrent focal CNAs using the correlation matrix diagonal segmentation (CMDS) algorithm, which identified 424 significant events affecting 582 genes. Taken together, our results demonstrate the robust performance of VarScan 2 for somatic mutation and CNA detection and shed new light on the landscape of genetic alterations in ovarian cancer.
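The two ideas summarized above, classifying a variant by comparing read counts in the tumor and normal samples, and inferring relative copy number from normalized read depth, can be illustrated with a minimal sketch. This is not the VarScan 2 implementation. The function names, the 10% minimum variant frequency, the 0.05 p-value cutoff, and the use of a two-sided Fisher's exact test on the read counts are all assumptions chosen for illustration.

```python
from math import comb, log2


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def classify_somatic_status(normal_ref, normal_var, tumor_ref, tumor_var,
                            min_var_freq=0.10, p_threshold=0.05):
    """Label one site as Reference, Germline, Somatic, LOH, or Unknown.

    Thresholds are hypothetical defaults for this sketch.
    """
    normal_depth = normal_ref + normal_var
    tumor_depth = tumor_ref + tumor_var
    if normal_depth == 0 or tumor_depth == 0:
        return "Unknown"  # no coverage in one sample, nothing to compare

    normal_has_variant = normal_var / normal_depth >= min_var_freq
    tumor_has_variant = tumor_var / tumor_depth >= min_var_freq
    if not normal_has_variant and not tumor_has_variant:
        return "Reference"

    p_value = fisher_two_sided(normal_ref, normal_var, tumor_ref, tumor_var)
    if p_value >= p_threshold:
        # Allele frequencies do not differ significantly between samples.
        return "Germline" if normal_has_variant else "Unknown"
    # Significant difference: variant absent from normal means somatic;
    # variant present in normal with a shifted tumor frequency means LOH.
    return "LOH" if normal_has_variant else "Somatic"


def log2_depth_ratio(tumor_depth, normal_depth, tumor_total, normal_total):
    """Log2 ratio of tumor to normal depth after scaling by total sequenced depth.

    All four arguments must be positive in this sketch.
    """
    if min(tumor_depth, normal_depth, tumor_total, normal_total) <= 0:
        raise ValueError("depths and totals must be positive")
    ratio = (tumor_depth * normal_total) / (normal_depth * tumor_total)
    return log2(ratio)
```

Under these assumptions, a site with 0 of 30 variant reads in the normal and 10 of 30 in the tumor is labeled Somatic, while a region covered at twice the normal depth in equally sized libraries gives a log2 ratio of 1. A real caller would also need base and mapping quality filters, strand checks, and segmentation of the depth ratios, none of which are shown here.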