Graduate Institute of Molecular Systems Biomedicine, China Medical University, Taichung, Taiwan.
PLoS One. 2012;7(6):e39252. doi: 10.1371/journal.pone.0039252. Epub 2012 Jun 18.
The structure of a protein determines its function and its interactions with other factors. Regions of proteins that interact with ligands, substrates, and/or other proteins tend to be conserved in both sequence and structure, and the residues involved are usually in close spatial proximity. More than 70,000 protein structures are currently found in the Protein Data Bank, and approximately one-third contain metal ions essential for function. Identifying and characterizing metal ion-binding sites experimentally is time-consuming and costly. Many computational methods have been developed to identify metal ion-binding sites, and most use only sequence information. For the work reported herein, we developed a method that uses both sequence and structural information to predict the residues in metal ion-binding sites. Templates for six types of metal ion-binding sites, those involving Ca(2+), Cu(2+), Fe(3+), Mg(2+), Mn(2+), and Zn(2+), were constructed from the residues within 3.5 Å of the center of the metal ion. Using the fragment transformation method, we then compared known metal ion-binding sites with the templates to assess the accuracy of our method. Our method achieved an overall accuracy of 94.6%, with a true positive rate of 60.5% at a 5% false positive rate, and therefore constitutes a significant improvement in metal-binding site prediction.
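The template-construction step described above (collecting the residues within 3.5 Å of the metal ion center) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Atom` record, the `binding_residues` function, and the toy coordinates are hypothetical, and the sketch assumes that a residue counts as part of the site when any one of its atoms lies within the cutoff, a detail the abstract does not specify.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical minimal atom record (not a real library type)."""
    chain: str
    res_name: str
    res_seq: int
    atom_name: str
    x: float
    y: float
    z: float


CUTOFF_ANGSTROM = 3.5  # distance cutoff stated in the abstract


def binding_residues(atoms, metal_xyz, cutoff=CUTOFF_ANGSTROM):
    """Return sorted (chain, res_seq, res_name) tuples for residues that have
    at least one atom within `cutoff` angstroms of the metal ion center.

    Assumption: "any atom within the cutoff" defines membership.
    """
    found = set()
    for atom in atoms:
        distance = math.dist((atom.x, atom.y, atom.z), metal_xyz)
        if distance <= cutoff:
            found.add((atom.chain, atom.res_seq, atom.res_name))
    return sorted(found)


# Toy coordinates invented for illustration; not taken from any PDB entry.
toy_atoms = [
    Atom("A", "HIS", 94, "NE2", 2.0, 0.0, 0.0),   # 2.0 angstroms from the metal
    Atom("A", "HIS", 94, "CA", 6.0, 0.0, 0.0),    # 6.0 angstroms, outside cutoff
    Atom("A", "CYS", 112, "SG", 0.0, 2.3, 0.0),   # 2.3 angstroms from the metal
    Atom("A", "GLY", 50, "CA", 0.0, 0.0, 5.0),    # 5.0 angstroms, outside cutoff
]
zinc_center = (0.0, 0.0, 0.0)

site = binding_residues(toy_atoms, zinc_center)
print(site)
```

In this toy case, HIS 94 and CYS 112 are selected and GLY 50 is excluded. In practice the atom list would come from the ATOM and HETATM records of a structure file, and the resulting residue set would serve as one template for the subsequent structural comparison.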