Jiang Yuexu, Wang Duolin, Yao Yifu, Eubel Holger, Künzler Patrick, Møller Ian Max, Xu Dong
Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, Columbia, MO, USA.
Institute of Plant Genetics, Leibniz University Hannover, Hannover, Germany.
Comput Struct Biotechnol J. 2021 Aug 18;19:4825-4839. doi: 10.1016/j.csbj.2021.08.027. eCollection 2021.
Prediction of protein localization plays an important role in understanding protein function and mechanisms. In this paper, we propose a general deep learning-based localization prediction framework, MULocDeep, which can predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 44 suborganellar localization annotations in 10 major subcellular compartments, the most comprehensive suborganellar localization dataset to date. We also experimentally generated an independent dataset of mitochondrial proteins in cell cultures, tubers, and roots and made this dataset publicly available. Evaluations using the above datasets show that, overall, MULocDeep outperforms other major methods at both subcellular and suborganellar levels. Furthermore, MULocDeep assesses each amino acid's contribution to localization, which provides insights into the mechanism of protein sorting and localization motifs. A web server can be accessed at http://mu-loc.org.