Xu Shuang, Hu Xiuzhen, Feng Zhenxing, Pang Jing, Sun Kai, You Xiaoxiao, Wang Ziyang
College of Sciences, Inner Mongolia University of Technology, Hohhot, China.
Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, China.
Front Genet. 2022 Jan 4;12:793800. doi: 10.3389/fgene.2021.793800. eCollection 2021.
Many protein functions depend on interactions with ligands; in particular, the binding of proteins to metal ion ligands plays an important biological role. Accurately identifying metal ion ligand-binding residues by computational approaches remains challenging. In this study, we propose an improved method to predict the binding residues of 10 metal ion ligands (Zn2+, Cu2+, Fe2+, Fe3+, Co2+, Mn2+, Ca2+, Mg2+, Na+, and K+). In addition to the basic feature parameters of amino acids, physicochemical properties, and predicted structural information, we added two further features: amino acid correlation information and binding residue propensity factors. With optimized parameters, we used the Gradient Boosting Machine (GBM) algorithm to predict metal ion ligand-binding residues. In the results obtained, the Sn and MCC values were above 10.17% and 0.297, respectively. Moreover, the Sn and MCC values for the transition metals were higher than 34.46% and 0.564, respectively. To test the validity of our model, another method (Random Forest) was also used for comparison. The better results obtained in this work indicate that the proposed method could be a valuable tool for predicting metal ion ligand-binding residues.
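The abstract reports Sn and MCC without defining them. The sketch below assumes that Sn denotes sensitivity (the fraction of true binding residues that are recovered) and that MCC is the Matthews correlation coefficient, both in their standard confusion-matrix form. The function names and the residue counts are hypothetical and are not taken from the study.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Sn = TP / (TP + FN): share of true binding residues predicted as binding."""
    total = tp + fn
    return tp / total if total else 0.0


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for one ligand (illustrative only, not from the paper):
# binding residues are rare, so negatives dominate.
tp, fn, fp, tn = 40, 60, 10, 890
print(f"Sn  = {sensitivity(tp, fn):.2%}")
print(f"MCC = {matthews_cc(tp, tn, fp, fn):.3f}")
```

Because binding residues are typically a small minority of all residues, MCC is a more informative summary than accuracy alone: it accounts for all four confusion-matrix cells, so a predictor that labels everything as non-binding scores 0 rather than appearing highly accurate.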