Zhang Ao, Jia Jianhua, Sun Mingwei, Wei Xin
School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China.
Business School, Jiangxi Institute of Fashion Technology, Nanchang, China.
Front Genet. 2025 Jul 23;16:1614222. doi: 10.3389/fgene.2025.1614222. eCollection 2025.
Enhancer-promoter interactions (EPIs) play a vital role in the regulation of gene expression. Although traditional wet-lab methods provide valuable insights into EPIs, they are often constrained by high costs and limited scalability. As a result, the development of efficient computational models has become essential. However, many current deep learning and machine learning approaches utilize simplistic feature fusion strategies, such as direct averaging or concatenation, which fail to effectively model complex relationships and dynamic importance across features. This often results in suboptimal performance in challenging biological contexts.
To address these limitations, we propose a deep learning model named EPI-DynFusion. This model begins by encoding DNA sequences using pre-trained DNA embeddings and extracting local features through convolutional neural networks (CNNs). It then integrates a Transformer and Bidirectional Gated Recurrent Unit (BiGRU) architecture with a Dynamic Feature Fusion mechanism to adaptively learn deep dependencies among features. Furthermore, we incorporate the Convolutional Block Attention Module (CBAM) to enhance the model's ability to focus on informative regions. Based on this core architecture, we develop two variants: EPI-DynFusion-gen, a general model, and EPI-DynFusion-best, a fine-tuned version for cell line-specific data.
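The contrast between fixed fusion (averaging) and a learned, input-dependent fusion can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the formula of the Dynamic Feature Fusion mechanism, so the sigmoid gate below is a generic gated-fusion scheme, and the names `average_fusion`, `gated_fusion`, `weights`, and `bias` are hypothetical. The vectors `a` and `b` stand in for features from two branches (for example, the Transformer and BiGRU outputs).

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def average_fusion(a, b):
    """Static fusion: every dimension mixes both branches 50/50."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def gated_fusion(a, b, weights, bias):
    """Input-dependent fusion (assumed form, not the paper's exact method).

    For each dimension i a gate g_i = sigmoid(w_i . [a; b] + bias_i) is
    computed from both feature vectors, and the output is
    g_i * a_i + (1 - g_i) * b_i.
    weights holds one row per dimension, each of length len(a) + len(b).
    """
    joint = list(a) + list(b)
    fused = []
    for i, (x, y) in enumerate(zip(a, b)):
        z = sum(w * v for w, v in zip(weights[i], joint)) + bias[i]
        g = sigmoid(z)
        fused.append(g * x + (1.0 - g) * y)
    return fused


if __name__ == "__main__":
    a = [1.0, 2.0]  # stand-in for one branch's features
    b = [3.0, 6.0]  # stand-in for the other branch's features
    zero_w = [[0.0] * 4, [0.0] * 4]
    # With all-zero parameters every gate is 0.5, so gating reduces to averaging.
    print(gated_fusion(a, b, zero_w, [0.0, 0.0]))
    # A large positive bias pushes the gate toward 1, favouring branch a.
    print(gated_fusion(a, b, zero_w, [10.0, 10.0]))
```

In a trained model the gate parameters would be learned jointly with the rest of the network, so the mixing ratio can change from one input sequence to the next, which a fixed average or concatenation cannot do.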
We evaluated our models on six benchmark cell lines. The cell line-specific, generic (EPI-DynFusion-gen), and best (EPI-DynFusion-best) models achieved average areas under the receiver operating characteristic curve (AUROC) of 94.8%, 95.0%, and 96.2%, respectively, and average areas under the precision-recall curve (AUPR) of 81.2%, 71.1%, and 83.3%, respectively. The fine-tuned model therefore performs best on both metrics, with the clearest margin in the precision-recall space. These results indicate that the proposed fusion strategy and attention mechanisms contribute to improved performance.
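The two reported metrics can diverge, as the generic model's 95.0% AUROC against 71.1% AUPR shows. The sketch below computes both from labels and scores on a toy example to make the definitions concrete. It is not the authors' evaluation code: the function names and data are hypothetical, AUROC is computed as the fraction of correctly ordered positive-negative pairs, and AUPR is approximated by average precision without special handling of tied scores.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits


if __name__ == "__main__":
    # Toy data: 2 interacting pairs (label 1) among 6 candidates.
    labels = [1, 0, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
    print(round(auroc(labels, scores), 4))              # 0.875
    print(round(average_precision(labels, scores), 4))  # 0.8333
```

Precision depends on how many negatives rank above each positive, so a single highly scored negative lowers average precision more than it lowers AUROC when positives are scarce.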
In conclusion, EPI-DynFusion provides a robust and scalable framework for predicting enhancer-promoter interactions based solely on DNA sequence information. By addressing the limitations of conventional fusion techniques and combining attention mechanisms with sequence modeling, our method achieves state-of-the-art performance while improving the interpretability and generalizability of EPI prediction.