Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America.
PLoS One. 2011;6(8):e22961. doi: 10.1371/journal.pone.0022961. Epub 2011 Aug 2.
Lung cancer, of which more than 80% is non-small cell, is the leading cause of cancer-related death in the United States. Copy number alterations (CNAs) in lung cancer have been shown to be positionally clustered in certain genomic regions. However, it remains unclear whether genes with copy number changes are functionally clustered. Using a dense single nucleotide polymorphism array, we performed genome-wide copy number analyses of a large collection of non-small cell lung tumors (n = 301). We proposed a formal statistical test for CNAs between different groups (e.g., non-involved lung vs. tumors, early- vs. late-stage tumors). We also customized the gene set enrichment analysis (GSEA) algorithm to investigate the overrepresentation of genes with CNAs in predefined biological pathways and gene sets (i.e., functional clustering). We found that CNA events increase substantially from germline to early-stage tumors and then to late-stage tumors. Beyond clustering by genomic position, CNAs tend to occur away from gene locations, especially in germline, non-involved tissue, and early-stage tumors. This tendency weakens from germline to early-stage and then to late-stage tumors, suggesting a relaxation of selection during tumor progression. Furthermore, genes with CNAs in non-small cell lung tumors were enriched in certain gene sets and biological pathways that play crucial roles in oncogenesis and cancer progression, demonstrating a functional aspect of CNAs, in the context of biological pathways, that was previously overlooked. We conclude that CNAs increase with disease progression and that they are both positionally and functionally clustered. The potential functional capabilities acquired via CNAs may be sufficient for normal cells to transform into malignant cells.
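To make the idea of functional clustering concrete, the following is a minimal sketch of a standard, unweighted GSEA-style running-sum enrichment score with a permutation p-value. It is not the customized algorithm used in this study, whose details are not given in the abstract. The gene names, the ranking by CNA frequency, and the function names `enrichment_score` and `permutation_p_value` are hypothetical and serve only as illustration.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted (Kolmogorov-Smirnov-like) running-sum enrichment score.

    ranked_genes: list of gene names, ordered from most to least altered
                  (assumed here: ranked by CNA frequency).
    gene_set:     set of gene names belonging to one pathway or gene set.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a gene-set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        # Keep the running-sum value with the largest deviation from zero.
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """One-sided p-value for positive enrichment, using random gene sets
    of the same size as the null distribution."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(gene in gene_set for gene in ranked_genes)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, size))
        if enrichment_score(ranked_genes, null_set) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)


# Hypothetical ranking: GENE_01 carries CNAs most often, GENE_10 least often.
ranked = [f"GENE_{i:02d}" for i in range(1, 11)]
pathway = {"GENE_01", "GENE_02", "GENE_03"}  # hypothetical pathway members
es, p = permutation_p_value(ranked, pathway)
print(f"ES = {es:.3f}, permutation p = {p:.4f}")
```

A gene set whose members sit near the top of the ranking yields a score close to 1, and one concentrated at the bottom yields a score close to -1. Permuting gene labels rather than sample labels is a simplifying assumption of this sketch.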