School of Software, Shandong University, Jinan, 250100, Shandong, China.
College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae083.
Enhancers are noncoding DNA fragments that play a pivotal role in gene regulation by facilitating gene transcription. Identifying enhancers is crucial for understanding genomic regulatory mechanisms, pinpointing key elements and investigating the networks that govern gene expression and disease-related mechanisms. Existing enhancer identification methods have limitations, which prompted us to develop a novel multi-input deep learning framework, termed Enhancer-MDLF. Experimental results show that Enhancer-MDLF outperforms the previous method, Enhancer-IF, across eight distinct human cell lines and exhibits superior performance on generic enhancer datasets and enhancer-promoter datasets, supporting the robustness of Enhancer-MDLF. Additionally, we introduce transfer learning as an effective and promising way to address the prediction challenges posed by enhancer specificity. Furthermore, we use model interpretation to identify transcription factor binding site motifs that may be associated with enhancer regions, which has important implications for the study of enhancer regulatory mechanisms. The source code is openly accessible at https://github.com/HaoWuLab-Bioinformatics/Enhancer-MDLF.