Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA.
Genetics. 2021 Mar 3;217(1):1-12. doi: 10.1093/genetics/iyaa021.
Somatic copy number alterations (SCNAs) serve as hallmarks of tumorigenesis and often result in deviations from one-to-one allelic ratios at heterozygous loci, leading to allelic imbalance (AI). The Cancer Genome Atlas (TCGA) reports SCNAs identified using a circular binary segmentation algorithm, providing segment mean copy number estimates from single-nucleotide polymorphism DNA microarray total intensities (log R ratio), but not from the allele-specific intensities ("B allele" frequencies) that are informative of AI. Our approach provides more sensitive identification of SCNAs by modeling the "B allele" frequencies jointly, thereby bolstering the catalog of chromosomal alterations in this widely utilized resource. Here we present AI summaries for all 33 tumor sites in TCGA, including AI induced by SCNAs and by copy-neutral loss-of-heterozygosity (cnLOH). We identified AI in 94% of the tumors, a higher proportion than in previous reports. Recurrent events included deletions of 17p, 9q, and 3p; amplifications of 8q, 1q, and 7p; and mixed event types on 8p and 13q. We also observed both site-specific and pan-cancer (spanning 17p) cnLOH, patterns that have not been comprehensively characterized. The identification of such cnLOH events elucidates tumor suppressors and multi-hit pathways to carcinogenesis. We also contrast the landscapes inferred from AI-derived and total intensity-derived SCNAs and propose an automated procedure to improve and adjust SCNAs in TCGA for cases where high levels of aneuploidy obscured identification of the baseline intensity. Our findings support the exploration of additional methods for robust automated inference and for aiding empirical discoveries across TCGA.
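The contrast between total intensity and "B allele" frequency can be illustrated with a small numerical sketch. The code below is not the authors' method; it assumes a simple two-population mixture (tumor cells at a hypothetical `purity`, normal cells with one copy of each allele) and a diploid baseline for the log R ratio. The function names and the example events are illustrative assumptions.

```python
import math


def total_copies(n_a, n_b, purity):
    """Average total copy number in a tumor/normal mixture (assumed model)."""
    return purity * (n_a + n_b) + (1.0 - purity) * 2.0


def expected_baf(n_a, n_b, purity):
    """Expected B allele frequency at a germline-heterozygous SNP.

    n_a, n_b: tumor copy numbers of the A and B alleles.
    purity:   assumed fraction of tumor cells in the sample (0..1).
    """
    total = total_copies(n_a, n_b, purity)
    if total == 0:
        raise ValueError("no DNA at this locus; BAF is undefined")
    b_copies = purity * n_b + (1.0 - purity) * 1.0
    return b_copies / total


def expected_lrr(n_a, n_b, purity):
    """Expected log R ratio relative to an assumed diploid baseline."""
    total = total_copies(n_a, n_b, purity)
    if total == 0:
        raise ValueError("no DNA at this locus; log R ratio is undefined")
    return math.log2(total / 2.0)


# Hypothetical events, given as (A copies, B copies) in the tumor cells.
events = {
    "normal": (1, 1),
    "hemizygous deletion": (1, 0),
    "single-copy gain": (2, 1),
    "cnLOH": (2, 0),
}

for name, (n_a, n_b) in events.items():
    baf = expected_baf(n_a, n_b, purity=0.6)
    lrr = expected_lrr(n_a, n_b, purity=0.6)
    print(f"{name:20s} BAF={baf:.3f}  LRR={lrr:+.3f}")
```

Under these assumptions, cnLOH leaves the total copy number at two, so the log R ratio stays at baseline while the BAF moves away from 0.5. This is why an analysis based on total intensity alone would miss such events.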