School of Science, Dalian Maritime University, Dalian, 116026, China.
Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC, 3800, Australia; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, 3800, Australia.
Anal Biochem. 2020 Mar 15;593:113592. doi: 10.1016/j.ab.2020.113592. Epub 2020 Jan 20.
Lysine succinylation is an important type of protein post-translational modification and plays a key role in regulating protein function and structural changes. The mechanism and function of succinylation have not yet been clarified. The key to better understanding the precise mechanism and functional role of succinylation is the identification of lysine succinylation sites. However, conventional experimental methods for identifying succinylation sites are often expensive, time-consuming, and labor-intensive. New computational approaches that can effectively identify lysine succinylation sites from sequence data are therefore much needed. In this study, we propose Inspector, a novel predictor for lysine succinylation site identification, developed using the random forest algorithm combined with a variety of sequence-based feature-encoding schemes. The edited nearest-neighbor undersampling method and the adaptive synthetic oversampling approach were employed to address dataset imbalance, and a two-step feature-selection strategy was applied to optimize the feature set and improve the accuracy of the prediction model. Empirical performance comparisons with existing tools showed that Inspector achieved competitive predictive performance in identifying lysine succinylation sites.
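As an illustration of the imbalance-handling step, the following is a minimal sketch of edited nearest-neighbor undersampling in plain Python. It is not the Inspector implementation: the abstract does not give the neighbor count, distance metric, or voting rule, so `k=3`, Euclidean distance, the majority-vote rule, the function name `edited_nearest_neighbors`, and the toy data are all assumptions made for this example. The adaptive synthetic oversampling and random forest steps are not shown.

```python
import math
from collections import Counter


def edited_nearest_neighbors(X, y, majority_label, k=3):
    """Drop majority-class samples whose label disagrees with the
    majority vote of their k nearest neighbours (Euclidean distance).

    Hypothetical helper for illustration; k and the voting rule are
    assumptions, not values taken from the abstract.
    """
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            # Minority-class samples are always kept.
            keep.append(i)
            continue
        # Distances from sample i to every other sample.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbour_labels = [y[j] for _, j in dists[:k]]
        vote = Counter(neighbour_labels).most_common(1)[0][0]
        if vote == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (assumed): label 0 = non-succinylated (majority), 1 = succinylated.
X = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # negatives
    (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),              # positives
    (1.05, 1.05),                                    # negative inside the positive cluster
]
y = [0, 0, 0, 0, 1, 1, 1, 0]

X_res, y_res = edited_nearest_neighbors(X, y, majority_label=0, k=3)
print(len(X_res), y_res)
```

In this toy set, the negative sample at `(1.05, 1.05)` is surrounded by positives, so it is the one sample removed. In practice, a maintained implementation such as the one in the imbalanced-learn package would likely be preferable to a hand-written loop.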