Yu Lezheng, Zhang Yonglin, Xue Li, Liu Fengjuan, Jing Runyu, Luo Jiesi
School of Chemistry and Materials Science, Guizhou Education University, Guiyang 550018, Guizhou, China.
Basic Medical College, Southwest Medical University, Luzhou 646000, Sichuan, China.
Comput Struct Biotechnol J. 2023 Sep 29;21:4836-4848. doi: 10.1016/j.csbj.2023.09.036. eCollection 2023.
Autophagy is a primary mechanism for maintaining cellular homeostasis. The synergistic actions of autophagy-related (ATG) proteins strictly regulate the whole autophagic process. Accurate identification of ATGs is therefore a critical first step toward revealing the molecular mechanism that regulates autophagy. Current computational methods can predict ATGs from primary protein sequences, but algorithmic limitations leave significant room for improvement. In this research, we propose EnsembleDL-ATG, an ensemble deep learning framework that aggregates multiple deep learning models to predict ATGs from protein sequence and evolutionary information. We first evaluated the performance of individual networks on various feature descriptors to identify the most promising models. We then explored all possible combinations of the independent models to select the most effective ensemble architecture. The final framework consists of four different deep learning models. Experimental results show that our proposed method achieves a prediction accuracy of 94.5% and a Matthews correlation coefficient (MCC) of 0.890, which are nearly 4% and 0.08 higher than those of ATGPred-FL, respectively. Overall, EnsembleDL-ATG is the first machine learning predictor of ATGs based on ensemble deep learning. The benchmark data and code used in this study are freely available at https://github.com/jingry/autoBioSeqpy/tree/2.0/examples/EnsembleDL-ATG.
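The selection step described in the abstract (score individual models, then try every combination and keep the best ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available in the repository linked above. The abstract does not state how member outputs are aggregated or which metric drives selection, so the sketch assumes soft voting (averaging predicted probabilities) and selection by MCC. The function names `mcc`, `accuracy`, and `best_ensemble`, the 0.5 threshold, and the toy probabilities are all hypothetical.

```python
from itertools import combinations

import numpy as np


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    denom = float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def best_ensemble(probs, y_true, threshold=0.5):
    """Exhaustively search all non-empty model subsets.

    probs maps a model name to its predicted positive-class probabilities.
    Assumption: members are combined by soft voting (mean probability) and
    subsets are ranked by MCC; ties keep the smaller, earlier subset.
    Returns (member_names, mcc, accuracy) for the best subset.
    """
    names = sorted(probs)
    best = None
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            avg = np.mean([probs[n] for n in combo], axis=0)
            pred = (avg >= threshold).astype(int)
            score = mcc(y_true, pred)
            if best is None or score > best[1]:
                best = (combo, score, accuracy(y_true, pred))
    return best


# Toy validation data (made up for illustration, not from the paper).
labels = [1, 1, 1, 0, 0, 0]
toy_probs = {
    "a": np.array([0.9, 0.8, 0.4, 0.3, 0.2, 0.6]),
    "b": np.array([0.6, 0.4, 0.9, 0.1, 0.7, 0.2]),
    "c": np.array([0.2, 0.3, 0.1, 0.8, 0.9, 0.7]),
}
print(best_ensemble(toy_probs, labels))
```

With n candidate models the search visits 2^n - 1 subsets, which is only practical for a small pool of candidates. Selecting the subset on the same data used to report performance would bias the estimate, so in practice the search would run on a validation split or inside cross-validation.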