
Prediction of CTCF loop anchor based on machine learning.

Authors

Zhang Xiao, Zhu Wen, Sun Huimin, Ding Yijie, Liu Li

Affiliations

School of Mathematics and Statistics, Hainan Normal University, Haikou, China.

Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China.

Publication information

Front Genet. 2023 Apr 3;14:1181956. doi: 10.3389/fgene.2023.1181956. eCollection 2023.

Abstract

Many activities in biological cells are affected by three-dimensional genome structure. Insulators play an important role in organizing this higher-order structure. CTCF is a representative mammalian insulator that can act as a barrier to the continuous extrusion of chromatin loops. As a multifunctional protein, CTCF has tens of thousands of binding sites in the genome, but only a portion of them serve as anchors of chromatin loops. It remains unclear how cells select anchors during chromatin looping. In this paper, a comparative analysis is performed to investigate the sequence preference and binding strength of anchor and non-anchor CTCF binding sites. Furthermore, a machine learning model based on CTCF binding intensity and DNA sequence is proposed to predict which CTCF sites can form chromatin loop anchors. The model we constructed for predicting the anchors of CTCF-mediated chromatin loops reached an accuracy of 0.8646. We find that the formation of loop anchors is mainly influenced by CTCF binding strength and binding pattern (which can be interpreted as the binding of different zinc fingers). In conclusion, our results suggest that the CTCF core motif and its flanking sequence may be responsible for the binding specificity. This work contributes to understanding the mechanism of loop anchor selection and provides a reference for the prediction of CTCF-mediated chromatin loops.
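To make the idea of a model "based on CTCF binding intensity and DNA sequence" concrete, the following is a minimal sketch of how one site might be turned into a numeric feature vector. The abstract does not say which sequence encoding or classifier the authors used, so the k-mer frequency encoding, the function names `kmer_frequencies` and `site_features`, and the `binding_signal` parameter are all illustrative assumptions, not the paper's method.

```python
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies over ACGT, in lexicographic order.

    Assumption: k-mer frequencies stand in for "DNA sequence" features;
    the abstract does not specify the encoding actually used.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Skip windows containing N or other ambiguous bases.
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def site_features(seq, binding_signal, k=2):
    """Combine a binding-strength value with sequence features.

    `binding_signal` is a hypothetical placeholder for any per-site
    measure of CTCF binding intensity (for example, a peak score).
    """
    return [float(binding_signal)] + kmer_frequencies(seq, k)


# Example: a feature vector for one hypothetical CTCF site.
features = site_features("CCGCGAGGGGGCGC", binding_signal=42.0)
```

Vectors like `features`, paired with anchor or non-anchor labels, could then be passed to any standard binary classifier. Which classifier suits this task, and whether it would approach the reported accuracy, is not something this sketch addresses.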

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c294/10106609/12fe5328b87f/fgene-14-1181956-g001.jpg
