Verdonk M L, Cole J C, Watson P, Gillet V, Willett P
Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK.
J Mol Biol. 2001 Mar 30;307(3):841-59. doi: 10.1006/jmbi.2001.4452.
SuperStar is an empirical method for identifying interaction sites in proteins, based entirely on experimental information about non-bonded interactions occurring in small-molecule crystal structures, taken from the IsoStar database. We describe recent modifications and additions to SuperStar, validating the results on a test set of 122 X-ray structures of protein-ligand complexes. In this validation, propensity maps are generated for all the binding sites of these proteins, using four different probes: a charged NH₃⁺ nitrogen atom, a carbonyl oxygen atom, a hydroxyl oxygen atom and a methyl carbon atom. Next, the maps are compared with the experimentally observed positions of ligand atoms of these types. A peak-searching algorithm is introduced that highlights potential interaction hot spots. For the three hydrogen-bonding probes - NH₃⁺ nitrogen atom, carbonyl oxygen atom and hydroxyl oxygen atom - the average distance from the ligand atom to the nearest SuperStar peak is 1.0-1.2 Å (0.8-1.0 Å for solvent-inaccessible ligand atoms). For the methyl carbon atom probe, this distance is about 1.5 Å, probably because interactions to methyl groups are much less directional. The most important addition to SuperStar is the enabling of propensity maps around metal centres - Ca²⁺, Mg²⁺ and Zn²⁺ - in protein binding sites. The results are validated on a test set of 24 protein-ligand complexes that have a metal ion in their binding site. Coordination geometries are derived automatically, using only the protein atoms that coordinate to the metal ion. The correct coordination geometry is derived in approximately 75% of the cases. If the derived geometry is assumed during the SuperStar calculation, the average distance from a ligand atom coordinating to the metal ion to the nearest peak in the propensity map for an oxygen probe is 0.87(7) Å. If the correct coordination geometry is imposed, this distance reduces to 0.59(7) Å. This indicates that the SuperStar predictions around metal-binding sites are at least as good as those around other protein groups. Using clustering techniques, a non-redundant set of probes is selected from the set of probes available in the IsoStar database. The performance in SuperStar of all these probes is tested on the test set of protein-ligand complexes. With the exception of the "ether oxygen" probe and the "any NH⁺" probe, all new probes perform as well as the four probes introduced first.
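The validation measure quoted in the abstract, the average distance from an observed ligand atom to the nearest propensity-map peak, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and all coordinates below are hypothetical, and the sketch assumes that peak positions and ligand atom positions are already available as Cartesian coordinates in Å.

```python
import math
from statistics import mean


def nearest_peak_distance(atom, peaks):
    """Return the distance from one atom position to its closest peak."""
    return min(math.dist(atom, peak) for peak in peaks)


def mean_nearest_peak_distance(atoms, peaks):
    """Average the nearest-peak distance over all ligand atoms."""
    return mean(nearest_peak_distance(atom, peaks) for atom in atoms)


# Hypothetical coordinates in Å, invented purely for illustration.
ligand_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
peaks = [(1.0, 0.0, 0.0), (3.0, 2.0, 0.0), (10.0, 10.0, 10.0)]

# First atom is 1.0 Å from its nearest peak, second atom is 2.0 Å.
average = mean_nearest_peak_distance(ligand_atoms, peaks)
print(f"mean distance to nearest peak: {average:.2f} Å")
```

A brute-force search over all peaks is adequate here because a binding site typically yields only a small number of peaks; a spatial index would matter only for much larger point sets.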