Rashid Md Mamunur, Shatabda Swakkhar, Hasan Md Mehedi, Kurata Hiroyuki
1 Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; 2 Department of Computer Science and Engineering, United International University, Plot-2, United City, Madani Avenue, Badda, Dhaka 1212, Bangladesh; 3 Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; 4 Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan.
Curr Genomics. 2020 Apr;21(3):194-203. doi: 10.2174/1389202921666200427210833.
A variety of protein post-translational modifications have been identified that control many cellular functions. Studies of phosphorylation in mycobacterial organisms have shown that it is critically important in diverse biological processes, such as intercellular communication and cell division. Recent technical advances in high-precision mass spectrometry have identified a large number of microbial phosphorylated proteins and phosphorylation sites through proteome-wide analysis. Identifying phosphorylated proteins and their specific modified residues experimentally is often labor-intensive, costly, and time-consuming. These limitations could be overcome by applying machine learning (ML) approaches. However, only a limited number of computational tools for phosphorylation site prediction have been developed so far. This work aims to present a complete survey of the existing ML predictors for microbial phosphorylation. We cover a variety of aspects that are important for developing a successful predictor, including the ML algorithms used, feature selection methods, window size, and software utility. First, we review the currently available databases of microbial phosphorylation sites and the state-of-the-art ML approaches, their working principles, and their performance. Lastly, we discuss the limitations and future directions of computational ML methods for phosphorylation prediction.
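The abstract names window size as one of the design choices for a phosphorylation site predictor. As a minimal sketch of what that choice means in practice, the Python snippet below cuts a fixed-length peptide window around each candidate residue in a protein sequence. The function name `extract_windows`, the candidate residue set (serine, threonine, tyrosine), the default half-window of 10, and the padding character `X` are illustrative assumptions, not details taken from the surveyed tools.

```python
def extract_windows(sequence, half_window=10, residues="STY", pad="X"):
    """Return (position, residue, window) for each candidate site.

    Hypothetical helper: the window spans `half_window` residues on each
    side of the candidate, so its length is 2 * half_window + 1.
    Positions are 1-based. Sequence ends are padded with `pad` so that
    every window has the same length.
    """
    padded = pad * half_window + sequence + pad * half_window
    windows = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            # Index i in `sequence` is index i + half_window in `padded`,
            # so the window centred on it starts at index i in `padded`.
            window = padded[i:i + 2 * half_window + 1]
            windows.append((i + 1, aa, window))
    return windows


if __name__ == "__main__":
    # Toy sequence, not real data.
    for pos, aa, window in extract_windows("MKSAT", half_window=2):
        print(pos, aa, window)
```

Each window would then be converted into numeric features and passed to a classifier. A larger `half_window` gives the model more sequence context but also more features per site, which is one reason window size is usually treated as a tunable parameter.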