Bao Chengfeng, Wang Gang, Sheng Guojun, Chen Yu
College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China.
Int J Mol Sci. 2025 Aug 20;26(16):8035. doi: 10.3390/ijms26168035.
Enhancer-promoter interactions (EPIs) play a key role in the epigenetic regulation of gene expression, shaping cellular identity and functional diversity. Dissecting these interactions is crucial for understanding transcriptional regulatory networks and their significance in cell differentiation, development, and disease. Here, we propose a novel deep learning framework, EPIFBMC (Enhancer-Promoter Interaction prediction with FBMC network), that leverages DNA sequence and genomic features for accurate EPI prediction. The FBMC network consists of three key modules: the Four-Encoding module first encodes the DNA sequence in multiple dimensions to extract key sequence information; the BESL (Balanced Ensemble Subset Learning) module then adopts an ensemble subset learning strategy to optimize feature learning from positive and negative samples; finally, the MCANet module trains the EPI predictor using a multi-channel network. We evaluated EPIFBMC on three cell line datasets (HeLa, IMR90, and NHEK) and validated its generalizability on three independent datasets (K562, GM12878, and HUVEC) through cross-cell-line experiments, where it compared favorably with state-of-the-art methods. Notably, EPIFBMC balances genomic feature richness against computational complexity, significantly speeding up training. Ablation studies identified two key DNA sequence features, positional conservation and positional specificity score, which showed critical predictive value across a benchmark dataset of six diverse cell lines. Computational tests show that EPIFBMC achieves excellent performance on the EPI prediction task, providing a powerful tool for decoding gene regulatory networks. We expect it to have important applications in developmental biology, disease mechanism research, and therapeutic target discovery.
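The abstract does not spell out how BESL builds its subsets, so the following is only a minimal sketch of one common balanced-ensemble scheme, not the authors' implementation. It assumes that negative enhancer-promoter pairs outnumber positives, splits the negatives into disjoint chunks, pairs every chunk with the full positive set, and averages the scores of the models trained on each subset. The names `balanced_subsets` and `ensemble_score` are hypothetical and chosen for illustration.

```python
import random


def balanced_subsets(positives, negatives, seed=0):
    """Pair all positives with disjoint chunks of the (larger) negative set.

    Illustrative assumption: one subset per unit of class imbalance, so each
    subset holds roughly as many negatives as positives.
    """
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    # Number of subsets is the integer imbalance ratio, and at least one.
    k = max(1, len(shuffled) // len(positives))
    # Strided slicing yields k disjoint chunks whose sizes differ by at most one.
    chunks = [shuffled[i::k] for i in range(k)]
    return [(list(positives), chunk) for chunk in chunks]


def ensemble_score(scores):
    """Average the interaction probabilities from the per-subset models."""
    return sum(scores) / len(scores)


# Toy example: 3 positive pairs and 10 negative pairs give 3 balanced subsets.
subsets = balanced_subsets(["p1", "p2", "p3"], [f"n{i}" for i in range(10)])
```

Each `(positives, negatives)` tuple would train one base model; at prediction time the base models' outputs for a candidate pair are combined with `ensemble_score`. Averaging is one simple combination rule among several, and the abstract does not say which rule EPIFBMC uses.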