Ju Zhe, Wang Shi-Yun
College of Science, Shenyang Aerospace University, 110136, People's Republic of China.
Comput Biol Chem. 2020 Aug;87:107280. doi: 10.1016/j.compbiolchem.2020.107280. Epub 2020 May 30.
Lysine 2-hydroxyisobutyrylation (Khib) is a recently identified histone mark that has been found to affect the association between histones and DNA. To better understand the molecular mechanism of Khib, it is important to accurately identify 2-hydroxyisobutyrylated substrates and their corresponding Khib sites. In this study, a novel bioinformatics tool named KhibPred is proposed to predict Khib sites in human HeLa cells. Three kinds of features, the composition of k-spaced amino acid pairs, binary encoding, and amino acid factors, are combined to encode Khib sites. Moreover, an ensemble support vector machine is employed to address the class imbalance problem in the prediction. In 10-fold cross-validation, KhibPred achieves satisfactory performance, with an area under the receiver operating characteristic curve of 0.7937. Therefore, KhibPred can be a useful tool for predicting protein Khib sites. Feature analysis shows that the polarity factor features play significant roles in the prediction of Khib sites. The conclusions derived from this study might provide useful insights for in-depth investigation into the molecular mechanisms of Khib.
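Two of the three feature types named in the abstract, the composition of k-spaced amino acid pairs and binary encoding, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cksaap` and `binary_encoding`, the maximum spacing `k_max`, the normalisation by the number of windows, and the example peptide are all assumptions made for illustration, since the abstract does not state these details.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(peptide, k_max=4):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    k_max is an assumed parameter; the abstract does not give the value used.
    Returns 400 * (k_max + 1) frequencies.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_windows = len(peptide) - k - 1
        for i in range(n_windows):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # skip pairs with padding or unknown residues
                counts[pair] += 1
        total = max(n_windows, 1)  # avoid division by zero on short peptides
        features.extend(counts[p] / total for p in PAIRS)
    return features


def binary_encoding(peptide):
    """One-hot encode each residue; non-standard characters become all zeros."""
    features = []
    for residue in peptide:
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


# Hypothetical sequence window centred on a candidate lysine (K).
window = "AEGTLKSGKAEEL"
vector = cksaap(window, k_max=2) + binary_encoding(window)
```

In a pipeline like the one the abstract describes, a vector of this kind would be concatenated with amino acid factor features and passed to the classifier. The window length and the handling of sequence ends are further design choices that the abstract leaves open.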