Enhancing Variant Calling in Whole-exome Sequencing Data Using Population-matched Reference Genomes.

Author information

Guo Shuming, Huang Zhuo, Zhang Yanming, He Yukun, Chen Xiangju, Wang Wenjuan, Li Lansheng, Kang Yu, Gao Zhancheng, Yu Jun, Du Zhenglin, Chu Yanan

Affiliations

Linfen Clinical Medicine Research Center, LinFen Central Hospital, LinFen 041000, China.

China National Center for Bioinformation, Beijing 100101, China.

Publication information

Genomics Proteomics Bioinformatics. 2024 Dec 3;22(5). doi: 10.1093/gpbjnl/qzae070.

Abstract

Whole-exome sequencing (WES) data are frequently used for cancer diagnosis and genome-wide association studies (GWAS), and their analysis depends on high-coverage read mapping, informative variant calling, and high-quality reference genomes. The central position of the currently used genome assembly, GRCh38, is now challenged by two newly published telomere-to-telomere (T2T) genomes, T2T-CHM13 and T2T-YAO, making it urgent to compare the three reference genomes and test population specificity using real-case WES data. Here, we report such an analysis for 19 tumor samples collected from Chinese patients. A primary comparison of the exon regions among the three references reveals that the sequences in up to ∼1% of target regions in T2T-YAO diverge widely from GRCh38, which may lead to off-target sequence capture. Nevertheless, T2T-YAO still outperforms GRCh38, yielding 7.41% more mapped reads. Owing to more reliable read mapping and a closer phylogenetic relationship with the samples than GRCh38, T2T-YAO halves the number of variant calls of clinical significance, most of which are benign, while maintaining sensitivity in identifying pathogenic variants. T2T-YAO also outperforms T2T-CHM13 in reducing calls of Chinese-specific variants. Our findings highlight the critical need to employ population-specific reference genomes in genomic analysis to ensure accurate variant analysis, and the significant benefits of tailoring these approaches to the unique genetic background of each ethnic group.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9b56/11687947/8d6124310bfc/qzae070f1.jpg
