Mammalian conservation of endogenous G-quadruplex reveals their associations with complex traits.

Author information

Zhang Ze-Hao, Wang Zi-Yan, Li Cong-Hui, Qian Sheng Hu, Zhang Wen, Chen Zhen-Xia

Affiliations

Hubei Hongshan Laboratory, Hubei Key Laboratory of Metabolic Abnormalities and Vascular Aging, Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, College of Biomedicine and Health, Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, PR China.

College of Informatics, Huazhong Agricultural University, Wuhan, 430070, PR China.

Publication information

Genome Biol. 2025 Sep 1;26(1):262. doi: 10.1186/s13059-025-03750-z.

Abstract

BACKGROUND

DNA G-quadruplexes (G4s) are four-stranded DNA structures. Endogenous G-quadruplexes (eG4s) have been identified as pivotal regulatory elements for gene expression in the human genome. The measurement of evolutionary conservation can be employed to ascertain the functional relevance of putative regulatory elements. However, the evolutionary profiles of human eG4s remain largely unknown.

RESULTS

Here, we construct mammalian evolutionary profiles of human eG4s based on a comprehensive reference annotation of human eG4s that integrates the eG4 database EndoQuad, covering 41 human cell lines, with our in-house G4 CUT&Tag data covering seven cell lines. We find that transposable elements contribute substantially to the evolutionary spread of primate-specific eG4s. A total of 92,910 highly conserved human eG4s were identified under mammalian constraint. By developing and applying eG4finder, an eG4 prediction tool based on a large language model, we verify the high structural conservation of highly conserved eG4s. The enrichment of highly conserved eG4s in developmental and aging pathways highlights their potential significance in key biological processes. Notably, highly conserved eG4s exhibit higher regulatory potential, regulatory activity and affinity for transcription factors. We demonstrate that highly conserved eG4s are the most powerful transcriptional activation elements in the total eG4 collection. In addition, trait-associated variants and variants affecting the expression of high phenotypic severity genes are most enriched in highly conserved eG4s.

CONCLUSIONS

Our study highlights that human eG4s that are highly conserved in the mammalian lineage have important regulatory functions and are closely associated with complex human traits.
