Mu Xuechen, Huang Zhenyu, Chen Qiufen, Shi Bocheng, Xu Long, Xu Ying, Zhang Kai
School of Mathematics, Jilin University, Changchun 130012, China.
School of Medicine, Southern University of Science and Technology, Shenzhen 518055, China.
Int J Mol Sci. 2024 Dec 2;25(23):12942. doi: 10.3390/ijms252312942.
Enhancers are short genomic segments located in non-coding regions of the genome that play a critical role in regulating the expression of target genes. Despite their importance in transcriptional regulation, effective methods for classifying enhancer categories and regulatory strengths remain limited. To address this challenge, we propose a novel end-to-end deep learning architecture named DeepEnhancerPPO. The model integrates ResNet and Transformer modules to extract local, hierarchical, and long-range contextual features. Following feature fusion, we employ Proximal Policy Optimization (PPO), a reinforcement learning technique, to reduce the dimensionality of the fused features, retaining the most relevant features for downstream classification tasks. We evaluate the performance of DeepEnhancerPPO from multiple perspectives, including ablation analysis, independent tests, assessment of PPO's contribution to performance enhancement, and interpretability of the classification results. Each module positively contributes to the overall performance, with ResNet and PPO being the most significant contributors. Overall, DeepEnhancerPPO demonstrates superior performance on independent datasets compared to other models, outperforming the second-best model by 6.7% in accuracy for enhancer category classification. The model consistently ranks among the top five classifiers out of 25 for enhancer strength classification without requiring re-optimization of the hyperparameters and ranks as the second-best when the hyperparameters are refined. This indicates that the DeepEnhancerPPO framework is highly robust for enhancer classification. Additionally, the incorporation of PPO enhances the interpretability of the classification results.
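The feature-selection step described above can be illustrated with a minimal sketch. The abstract does not say how PPO actions map onto the fused features, so this sketch assumes each action is a binary keep/drop mask sampled from per-feature probabilities. The function names (`fuse`, `sample_mask`, `apply_mask`) and all numeric values are hypothetical, and the policy update and reward computation that make up PPO itself are omitted.

```python
import random


def fuse(local_feats, context_feats):
    """Concatenate two feature vectors (an assumed, simple form of fusion)."""
    return list(local_feats) + list(context_feats)


def sample_mask(keep_probs, rng):
    """Sample one keep (1) / drop (0) decision per feature.

    keep_probs stands in for the output of a policy network; here it is
    just a list of Bernoulli probabilities supplied by the caller.
    """
    return [1 if rng.random() < p else 0 for p in keep_probs]


def apply_mask(features, mask):
    """Return only the features whose mask entry is 1."""
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same length")
    return [f for f, m in zip(features, mask) if m == 1]


# Toy stand-ins for the outputs of the ResNet and Transformer branches.
local_feats = [0.2, 1.5, -0.3]
context_feats = [0.9, 0.0]
fused = fuse(local_feats, context_feats)

# Hypothetical per-feature keep probabilities from a policy.
keep_probs = [0.9, 0.1, 0.8, 0.5, 0.2]
mask = sample_mask(keep_probs, random.Random(0))
selected = apply_mask(fused, mask)

print("mask:", mask)
print("selected features:", selected)
```

In a full reinforcement learning setup, the classifier's performance on the selected features would presumably serve as the reward used to update the keep probabilities. That loop is outside the scope of this sketch.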