Expression regulation of genes is linked to their CpG density distributions around transcription start sites.

Affiliations

Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.

Publication information

Life Sci Alliance. 2022 May 17;5(9). doi: 10.26508/lsa.202101302. Print 2022 Sep.

Abstract

The CpG dinucleotide and its methylation play vital roles in gene regulation. Previous studies have divided genes into several categories based on the CpG intensity around transcription start sites (TSSs) and found that housekeeping genes tend to have high CpG density, whereas tissue-specific genes are generally characterized by low CpG density. In this study, we investigated how the CpG density distribution of a gene affects its transcription and regulation pattern. Using a semi-supervised neural network that we designed, which takes data augmentation into account, we divided human genes into three clusters based on the CpG density distribution around the TSS; genes within each cluster share a similar CpG density distribution. Beyond sequence properties, these clusters differ distinctly in structural features, regulatory mechanisms, correlation patterns between expression level and CpG/TpG density, and changes in expression and epigenetic marks during tumorigenesis. For instance, the activation of cluster 3 genes relies more on 3D genome reorganization than that of cluster 1 and 2 genes, whereas cluster 2 genes show the strongest correlation between gene expression and H3K27me3. Genes in which regulation is uncoupled from histone modifications are mainly in cluster 3. These results emphasize that the use of epigenetic marks in gene regulation is partially rooted in sequence properties of genes, such as their CpG density distribution, and explain to some extent why the relation between epigenetic marks and gene expression is controversial.
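To make the notion of a "CpG density distribution around the TSS" concrete, here is a minimal Python sketch of one way such a profile could be computed from a sequence centered on a TSS. The abstract does not give the authors' exact definition, so the function name `cpg_density_profile`, the window and step sizes, and the normalization by the number of dinucleotide positions are all illustrative assumptions, not the paper's method.

```python
from typing import List


def cpg_density_profile(seq: str, window: int = 100, step: int = 50) -> List[float]:
    """Return the CpG density of each sliding window along `seq`.

    Density is defined here (an assumption, not the paper's definition) as
    the number of "CG" dinucleotides in the window divided by the number of
    dinucleotide positions in the window (window - 1).
    """
    if window < 2 or step < 1:
        raise ValueError("window must be >= 2 and step >= 1")
    seq = seq.upper()
    profile = []
    # Slide a fixed-size window from the 5' end to the 3' end.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # "CG" cannot overlap itself, so str.count finds every occurrence.
        n_cpg = chunk.count("CG")
        profile.append(n_cpg / (window - 1))
    return profile


# Toy sequence: a CpG-rich half followed by a CpG-free half.
toy = "CG" * 10 + "AT" * 10
print(cpg_density_profile(toy, window=20, step=10))
```

On the toy sequence the density falls from the CpG-rich windows to zero in the CpG-free window. A vector of this kind, computed per gene, is the sort of input a clustering model could take; the semi-supervised network itself is not specified in the abstract and is not sketched here.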

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2976/9113945/cf85aa747603/LSA-2021-01302_Fig1.jpg
