New York Genome Center, New York, NY, USA.
Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA.
Nat Genet. 2023 Dec;55(12):2139-2148. doi: 10.1038/s41588-023-01540-6. Epub 2023 Nov 9.
Short-read sequencing is the workhorse of cancer genomics yet is thought to miss many structural variants (SVs), particularly large chromosomal alterations. To characterize missing SVs in short-read whole genomes, we analyzed 'loose ends': local violations of mass balance between adjacent DNA segments. In the landscape of loose ends across 1,330 high-purity cancer whole genomes, most large (>10-kb) clonal SVs were fully resolved by short reads in the 87% of the human genome where copy number could be reliably measured. Some loose ends represent neotelomeres, which we propose as a hallmark of the alternative lengthening of telomeres phenotype. These pan-cancer findings were confirmed by long-molecule profiles of 38 breast cancer and melanoma cases. Our results indicate that aberrant homologous recombination is unlikely to drive the majority of large cancer SVs. Furthermore, analysis of mass balance in short-read whole-genome data provides a surprisingly complete picture of cancer chromosomal structure.