Closing the gaps, and improving somatic structural variant analysis and benchmarking using CHM13-T2T.

Authors

Paulin Luis F, Fan Jeremy, O'Neill Kieran, Pleasance Erin, Porter Vanessa L, Jones Steven J M, Sedlazeck Fritz J

Affiliations

Human Genome Sequencing Center Baylor College of Medicine, Houston, Texas 77030, USA.

Canada's Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, British Columbia V5Z 1L3, Canada.

Publication information

Genome Res. 2025 Apr 14;35(4):621-631. doi: 10.1101/gr.279352.124.

Abstract

The complexities of cancer genomes are becoming more easily interpreted due to advancements in sequencing technologies and improved bioinformatic analysis. Structural variants (SVs) represent an important subset of somatic events in tumors. While the detection of SVs has been markedly improved by the development of long-read sequencing, somatic variant identification and annotation remain challenging. We hypothesized that the use of a completed human reference genome (CHM13-T2T) would improve somatic SV calling. Our findings in a tumor-normal matched benchmark sample and three patient samples show that the CHM13-T2T improves SV detection accuracy compared to GRCh38 with a notable reduction in false-positive calls, and thus supports improved prioritization. We also overcame the lack of annotation resources for CHM13-T2T by lifting over CHM13-T2T-aligned reads to the GRCh38 genome, therefore combining both improved alignment and advanced annotations. In this process, we assessed the current SV benchmark set for COLO829/COLO829BL across four replicates sequenced at different centers with different long-read technologies. We discovered instability of this cell line across these replicates; 346 SVs (1.13%) were only discoverable in a single replicate. We identify 54 somatic SVs, which appear to be stable as they are consistently present across the four replicates. As such, we propose this consensus set as an updated benchmark for somatic SV calling and include both GRCh38 and CHM13-T2T coordinates in our benchmark. Our work demonstrates new approaches to optimize somatic SV detection in cancer with potential improvements in other genetic diseases.
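The replicate comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state how calls were matched between replicates, so the `SV` record, the `matches` rule (same chromosome, same type, breakpoints within a tolerance), and the 500 bp default are all assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical minimal SV record; real callers emit much richer VCF entries.
SV = namedtuple("SV", ["chrom", "pos", "svtype"])


def matches(a, b, tol=500):
    """Assumed matching rule: same chromosome and type, positions within tol bp."""
    return a.chrom == b.chrom and a.svtype == b.svtype and abs(a.pos - b.pos) <= tol


def consensus(replicates, tol=500):
    """Calls from the first replicate that have a match in every other replicate."""
    first, rest = replicates[0], replicates[1:]
    return [
        sv for sv in first
        if all(any(matches(sv, other, tol) for other in rep) for rep in rest)
    ]


def singletons(replicates, tol=500):
    """(replicate index, call) pairs with no match in any other replicate."""
    found = []
    for i, rep in enumerate(replicates):
        others = [r for j, r in enumerate(replicates) if j != i]
        for sv in rep:
            if not any(matches(sv, o, tol) for r in others for o in r):
                found.append((i, sv))
    return found


# Toy data standing in for four replicates of one cell line (values are invented).
replicates = [
    [SV("chr1", 1000, "DEL"), SV("chr2", 5000, "INS")],
    [SV("chr1", 1100, "DEL")],
    [SV("chr1", 950, "DEL"), SV("chr3", 700, "DUP")],
    [SV("chr1", 1020, "DEL")],
]

stable = consensus(replicates)      # calls supported by all four replicates
unstable = singletons(replicates)   # calls seen in exactly one replicate
```

In this toy input only the chr1 deletion is supported by every replicate, while the chr2 insertion and chr3 duplication each appear in a single replicate. A production comparison would also need to handle SV length, breakpoint confidence intervals, and translocation partners, which this sketch ignores.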

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e3b/12047239/48fd619eed8a/621f01.jpg
