CpG Island Definition and Methylation Mapping of the T2T-YAO Genome.

Affiliations

College of Computer Science, Sichuan University, Chengdu 610065, China.

Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China.

Publication information

Genomics Proteomics Bioinformatics. 2024 Jul 3;22(2). doi: 10.1093/gpbjnl/qzae009.

Abstract

Precisely defining and mapping all cytosine (C) positions and their clusters, known as CpG islands (CGIs), as well as their methylation status, is pivotal for genome-wide epigenetic studies, especially when population-centric reference genomes are ready for timely application. Here, we first align two high-quality reference genomes from different ethnic backgrounds, T2T-YAO and T2T-CHM13, in a base-by-base fashion and compute their genome-wide density-defined and position-defined CGIs. Second, by mapping representative genome-wide methylation data from selected organs onto the two genomes, we find about 4.7%-5.8% sequence divergence of variable categories, depending on quality cutoffs. Genes within the divergent sequences are mostly associated with neurological functions. Moreover, CGIs associated with the divergent sequences differ significantly between the two genomes in CpG density and in the observed CpG/expected CpG (O/E) ratio. Finally, we find that when whole-genome bisulfite sequencing (WGBS) data from European and American populations are mapped to each reference, the T2T-YAO genome not only has greater CpG coverage than the T2T-CHM13 genome but also shows more hyper-methylated CpG sites. Our study suggests that future genome-wide epigenetic studies of Chinese populations will rely on both the acquisition of high-quality methylation data and subsequent precise CGI mapping based on the Chinese T2T reference.
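The abstract compares CGIs by CpG density and O/E ratio but does not state the formulas it uses. As a rough illustration only, the sketch below computes these quantities for a single sequence using the widely used Gardiner-Garden and Frommer form of the ratio, O/E = (number of CpG × length) / (number of C × number of G). The function name `cpg_stats` and the choice of CpG density as CpG dinucleotides per base are assumptions made for this example, not the authors' pipeline.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC content, CpG density and CpG O/E ratio for one sequence.

    Hypothetical helper: the O/E formula follows Gardiner-Garden and Frommer;
    the paper's own definitions may differ.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    # "CG" cannot overlap with itself, so str.count finds every CpG dinucleotide.
    cpg = s.count("CG")

    gc_content = (c + g) / n if n else 0.0
    # Assumed definition of density: CpG dinucleotides per base.
    cpg_density = cpg / n if n else 0.0
    # Observed CpG count divided by the count expected from C and G frequencies.
    oe_ratio = (cpg * n) / (c * g) if c and g else 0.0

    return {
        "length": n,
        "gc_content": gc_content,
        "cpg_density": cpg_density,
        "oe_ratio": oe_ratio,
    }


if __name__ == "__main__":
    # Toy sequence, not taken from T2T-YAO or T2T-CHM13.
    print(cpg_stats("ACGCGTTCGGCGATCG"))
```

In a density-defined scan, a helper like this would typically be applied to sliding windows and the windows filtered by thresholds; the classic criteria are a length of at least 200 bp, GC content above 50%, and O/E above 0.6, though the cutoffs used in the study are not given in the abstract.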

