Center for Informational Biology at University of Electronic Science and Technology of China, China.
Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, China.
Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab486.
Post-translational modification (PTM) refers to the covalent and enzymatic modification of proteins after protein biosynthesis, which orchestrates a variety of biological processes. Detecting PTM sites at the proteome scale is one of the key steps toward an in-depth understanding of their regulatory mechanisms. In this study, we present an integrated method based on eXtreme Gradient Boosting (XGBoost), called iRice-MS, to identify 2-hydroxyisobutyrylation, crotonylation, malonylation, ubiquitination, succinylation and acetylation sites in rice. For each PTM-specific model, we adopted eight feature encoding schemes, covering sequence-based features, physicochemical property-based features and spatial mapping information-based features. The optimal feature set was identified for each encoding, and the corresponding models were established. Extensive experimental results show that iRice-MS consistently performs well in both 5-fold cross-validation and independent dataset tests. In addition, our approach outperforms other existing tools in terms of AUC value. Based on the proposed model, a web server named iRice-MS was established and is freely accessible at http://lin-group.cn/server/iRice-MS.
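To make the ideas of a sequence-based feature encoding and of AUC-based evaluation concrete, the following is a minimal Python sketch under stated assumptions. It is not the iRice-MS implementation. The abstract does not name the eight encodings, so amino acid composition is used here purely as an illustration. The lysine-centred window, the flank size, the padding character, and the function names `extract_windows`, `aac_encode` and `auc` are all hypothetical. The XGBoost classifier itself is omitted; the feature vectors produced here would be the kind of input such a model is trained on.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_windows(sequence, flank=3, residue="K", pad="X"):
    """Return (1-based position, window) pairs centred on each target residue.

    Assumption: candidate sites are lysines (K) and windows that run past the
    sequence ends are padded with a placeholder character.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, aa in enumerate(sequence):
        if aa == residue:
            # Residue i sits at index i + flank in the padded string.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def aac_encode(window):
    """Amino acid composition: frequency of each standard residue in the window.

    Padding and non-standard characters are ignored when counting.
    """
    counts = Counter(aa for aa in window if aa in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive is scored above a random negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy sequence and toy scores, invented for illustration only.
    for position, window in extract_windows("MKAK", flank=1):
        print(position, window, aac_encode(window))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In a 5-fold cross-validation setting, a function like `auc` would be applied to the held-out predictions of each fold and the five values averaged. The pairwise formulation is quadratic in the number of samples, so it is suitable for illustration rather than for proteome-scale data.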