TECM-ChI: A TECM network-based method for chromatin interaction prediction.

Authors

Chen Yu, Bao Chengfeng, Wang Gang, Sheng Guojun

Affiliations

College of Computer and Control Engineering, Northeast Forestry University, Hexing Road 26, 150040 Heilongjiang Province, China.

Publication information

Gene. 2025 Sep 15;965:149656. doi: 10.1016/j.gene.2025.149656. Epub 2025 Jul 11.

Abstract

Chromatin interactions are regulatory relationships formed between chromatin regions through physical contact or spatial proximity; they play a crucial role in genome function, genome structure, and the development of diseases. In cancer research, for example, thinking of chromatin as a gel can help explain the spread of cancer. Traditional experimental methods, such as Hi-C and ChIA-PET, are costly, time-consuming, and applicable to only a limited number of cell lines. Increasing evidence shows that DNA sequences and genomic features (e.g., CTCF motifs, sequence conservation, and chromatin-associated proteins) are essential predictors of chromatin interactions. However, existing computational methods based on these features suffer from data imbalance and low prediction accuracy, which limits their broader application in biomedical research. To address this, we propose a new model, TECM-ChI, that predicts whether chromatin interactions exist based on DNA sequences and genomic features. In this model, we first design the FCR (Forward Combine Reverse) method to balance the positive and negative samples in the K562, IMR90, and GM12878 datasets to a 1:1 ratio. Additionally, to fully extract meaningful information from the gene sequences, we develop a preprocessing Three-Encoding module that applies three encoding methods and concatenates their outputs so that each nucleotide is represented as a 45-dimensional vector. Next, we propose the CMANet network model, which combines multi-layer convolution with multiple attention mechanisms. CMANet effectively extracts local features from the sequence information and enhances focus on key regions, improving the ability to recognize chromatin interactions. To evaluate the effectiveness of TECM-ChI, we conducted model variant experiments, loss performance analysis, and comparative analysis with existing computational methods across three cell lines. Experimental results show that, compared to the current best models, TECM-ChI achieves accuracy improvements of 4.68%, 1.31%, and 2.41% on the K562, IMR90, and GM12878 datasets, respectively, demonstrating its effectiveness and generalization ability in predicting chromatin interactions. The source code for TECM-ChI is available at https://github.com/Fated-2/TECM-ChI.git.
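
The abstract does not spell out how FCR balances the classes, so the following is only an illustrative sketch of one plausible reading: the smaller class is augmented with reverse-complemented copies of its own sequences, and the larger class is randomly downsampled if a gap remains. The function names `reverse_complement` and `balance_one_to_one`, the use of single sequences rather than sequence pairs, and the downsampling fallback are all assumptions made for illustration; they are not taken from the TECM-ChI source code.

```python
import random

# Complement table for uppercase DNA; N (unknown base) maps to itself.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def balance_one_to_one(positives, negatives, seed=0):
    """Balance two classes to a 1:1 ratio (hypothetical reading of FCR).

    The smaller class gains at most one reverse-complement copy per
    sequence; any remaining gap is closed by downsampling the larger class.
    """
    rng = random.Random(seed)
    pos, neg = list(positives), list(negatives)
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)

    # Add reverse-complemented copies of randomly chosen minority sequences.
    deficit = len(majority) - len(minority)
    chosen = rng.sample(minority, min(deficit, len(minority)))
    minority.extend(reverse_complement(s) for s in chosen)

    # Assumption: if the classes still differ, downsample the larger one.
    if len(majority) > len(minority):
        majority[:] = rng.sample(majority, len(minority))
    return pos, neg


if __name__ == "__main__":
    pos, neg = balance_one_to_one(["AAGC", "ACGG"], ["TTTT", "CCCC", "GGGG"])
    print(len(pos), len(neg))
```

Reverse-complementing keeps the augmented samples biologically meaningful, because the two strands of a DNA region carry the same information; whether TECM-ChI uses the reverse complement or a plain reversed sequence is not stated in the abstract.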
