Deng Hai, Chen Guantao, Yang Wei, Yang Jenny J
Department of Computer Science, Georgia State University, Atlanta, Georgia 30302, USA.
Proteins. 2006 Jul 1;64(1):34-42. doi: 10.1002/prot.20973.
Identifying calcium-binding sites in proteins is one of the first steps toward predicting and understanding the role of calcium in biological systems for protein structure and function studies. Because calcium-binding sites are complex and irregular, a fast and accurate method for predicting and identifying calcium-binding proteins is needed. Here we report the development of a new fast algorithm (GG) to detect calcium-binding sites. The GG algorithm uses a graph theory algorithm to find oxygen clusters in the protein and a geometric algorithm to identify the centers of these clusters. A cluster of four or more oxygen atoms has a high potential for calcium binding. The algorithm achieved about 90% site sensitivity and 80% site selectivity on three datasets containing a total of 123 proteins. The results suggest that a sphere of a certain size with four or more oxygen atoms on its surface and no other atoms inside is necessary and sufficient for quickly identifying the majority of calcium-binding sites with high accuracy. Our finding opens a new avenue for visualizing and analyzing calcium-binding sites in proteins, facilitating the prediction of function from structural genomic information.
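The two stages described in the abstract (a graph step that groups nearby oxygen atoms, then a geometric step that locates the cluster center and checks that the sphere is empty) can be illustrated with a minimal sketch. This is not the authors' GG implementation: the function names, the brute-force search over groups of exactly four oxygens, and the cutoffs `max_oo`, `r_min`, and `r_max` are assumptions chosen for illustration, since the abstract does not state them.

```python
from itertools import combinations
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def circumsphere(p0, p1, p2, p3):
    """Return (center, radius) of the sphere through four points,
    or None if the points are (nearly) coplanar."""
    # Equal distance to p0 and p_i gives 2(p_i - p0) . c = |p_i|^2 - |p0|^2.
    rows, rhs = [], []
    for p in (p1, p2, p3):
        rows.append([2.0 * (p[i] - p0[i]) for i in range(3)])
        rhs.append(sum(p[i] ** 2 - p0[i] ** 2 for i in range(3)))
    d = _det3(rows)
    if abs(d) < 1e-9:
        return None
    center = []
    for col in range(3):  # Cramer's rule, one coordinate at a time
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(_det3(m) / d)
    return tuple(center), math.dist(center, p0)


def candidate_sites(oxygens, other_atoms=(), max_oo=6.0, r_min=1.8, r_max=3.0):
    """Hypothetical helper: return (center, radius, indices) for each group of
    four mutually close oxygens whose circumsphere has an acceptable radius
    and contains no other atom. All cutoffs (in angstroms) are illustrative."""
    n = len(oxygens)
    # Graph step: connect oxygens that lie within max_oo of each other.
    adjacent = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if math.dist(oxygens[i], oxygens[j]) <= max_oo:
            adjacent[i].add(j)
            adjacent[j].add(i)
    sites = []
    for quad in combinations(range(n), 4):
        # Keep only 4-cliques: every pair in the group must be connected.
        if any(b not in adjacent[a] for a, b in combinations(quad, 2)):
            continue
        # Geometric step: sphere with the four oxygens on its surface.
        sphere = circumsphere(*(oxygens[k] for k in quad))
        if sphere is None:
            continue
        center, radius = sphere
        if not (r_min <= radius <= r_max):
            continue
        # Emptiness check: no remaining atom may lie inside the sphere.
        rest = list(other_atoms) + [oxygens[k] for k in range(n) if k not in quad]
        if any(math.dist(center, a) < radius - 1e-9 for a in rest):
            continue
        sites.append((center, radius, quad))
    return sites
```

Calling `candidate_sites` with oxygen coordinates parsed from a structure file returns one entry per accepted group; passing the remaining heavy atoms as `other_atoms` enforces the empty-sphere condition. The exhaustive search over all groups of four is only practical for small inputs; a real implementation would use a proper clique-finding algorithm and spatial indexing.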