School of Mathematics and Information Science, Guangzhou University, Guangzhou, 510006, China.
Anal Biochem. 2019 Feb 1;566:75-88. doi: 10.1016/j.ab.2018.11.009. Epub 2018 Nov 9.
Accurately identifying metal ion-binding sites from protein sequences alone is valuable for both basic experimental biology and drug discovery. Although considerable progress has been made, metal ion-binding site prediction remains challenging because metal ions are small and highly versatile. In this paper, we develop a ligand-specific predictor called MIonSite for predicting metal ion-binding sites from protein sequences. MIonSite first combines protein evolutionary information, predicted secondary structure, predicted solvent accessibility, and conservation information calculated from the Jensen-Shannon divergence score to extract discriminative features for each residue. An enhanced AdaBoost algorithm is then designed to cope with the severe class imbalance inherent in metal ion-binding site prediction, where non-binding sites far outnumber metal ion-binding sites. A new gold-standard benchmark dataset, consisting of training and independent validation subsets for Zn2+, Ca2+, Mg2+, Mn2+, Fe3+, Cu2+, Fe2+, Co2+, Na+, K+, Cd2+, and Ni2+, is constructed to compare MIonSite with existing predictors. Experimental results demonstrate that MIonSite achieves high prediction performance and outperforms other state-of-the-art sequence-based predictors. The standalone program of MIonSite and the corresponding datasets can be freely downloaded for academic use at https://github.com/LiangQiaoGu/MIonSite.git.
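To illustrate the conservation feature mentioned in the abstract, the following is a minimal sketch of a Jensen-Shannon divergence score for one column of a multiple sequence alignment. It is not the authors' implementation: the function names `js_divergence` and `column_conservation`, the uniform background distribution, and the pseudocount value are all assumptions made for illustration, and the abstract does not say how MIonSite weights sequences or handles gaps.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def js_divergence(p, q):
    """Jensen-Shannon divergence of two discrete distributions.

    Base-2 logarithms are used, so the result lies in [0, 1].
    """
    def kl(a, b):
        # Terms with a == 0 contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Mixture distribution; positive wherever p or q is positive.
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def column_conservation(column, background=None, pseudocount=1e-6):
    """Score one alignment column; a higher score means more conserved.

    `column` is a string with one residue per aligned sequence.
    Characters outside the 20 standard amino acids (such as gaps)
    are ignored. The uniform background is an assumption.
    """
    if background is None:
        background = [1 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # nothing but gaps: no evidence of conservation
    denom = total + pseudocount * len(AMINO_ACIDS)
    freqs = [(counts[a] + pseudocount) / denom for a in AMINO_ACIDS]
    return js_divergence(freqs, background)
```

Under these assumptions, a column in which every sequence carries the same residue (for example `"CCCCCCCC"`) scores higher than a column of mixed residues, and a column that matches the background distribution scores close to zero.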