Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA 30076, USA.
Bioinformatics. 2013 Mar 1;29(5):597-604. doi: 10.1093/bioinformatics/btt024. Epub 2013 Jan 17.
Most proteins interact with small-molecule ligands such as metabolites or drug compounds. Over the past several decades, many of these interactions have been captured in high-resolution atomic structures. Geometrically, most of the ligand-binding sites revealed in these structures form concave shapes, or 'pockets', on the protein's surface. An efficient method for comparing these pockets could greatly assist the classification of ligand-binding sites, the prediction of protein molecular function and the design of novel drug compounds.
We introduce a computational method, APoc (Alignment of Pockets), for the large-scale, sequence order-independent, structural comparison of protein pockets. A scoring function, the Pocket Similarity Score (PS-score), is derived to measure the level of similarity between pockets. Statistical models are used to estimate the significance of the PS-score based on millions of comparisons of randomly related pockets. APoc is a general, robust method that may be applied to pockets identified by various approaches, such as ligand-binding sites observed in experimental complex structures or pockets predicted by a pocket-detection method. Finally, we curate large benchmark datasets to evaluate the performance of APoc and present examples that demonstrate the usefulness of the method. We also demonstrate that APoc outperforms the geometric hashing-based method SiteEngine.
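The idea of judging a similarity score against a background of randomly related pairs can be illustrated with a minimal sketch. This is not APoc's statistical model, which the abstract does not specify; it is a plain empirical tail estimate, and the function name `empirical_p_value` and the synthetic background scores are assumptions made purely for illustration.

```python
import random


def empirical_p_value(observed, background):
    """Estimate how often a random comparison scores at least `observed`.

    Uses the tail fraction of the background sample with a +1 pseudocount,
    so the estimate is never exactly zero. Illustrative only: this is not
    the statistical model used by APoc.
    """
    if not background:
        raise ValueError("background must be non-empty")
    hits = sum(1 for score in background if score >= observed)
    return (hits + 1) / (len(background) + 1)


# Synthetic stand-in for scores of randomly related pocket pairs.
# These numbers are made up for the demonstration, not real PS-score data.
rng = random.Random(0)
background_scores = [rng.gauss(0.30, 0.05) for _ in range(10000)]

p_high = empirical_p_value(0.60, background_scores)     # far above the background
p_typical = empirical_p_value(0.30, background_scores)  # near the background mean

print(f"p-value for a score of 0.60: {p_high:.5f}")
print(f"p-value for a score of 0.30: {p_typical:.3f}")
```

An empirical estimate like this cannot resolve p-values smaller than roughly one over the background size, which is one reason a fitted statistical model is preferable when the background contains millions of comparisons and very small p-values matter.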
The APoc software package including the source code is freely available at http://cssb.biology.gatech.edu/APoc.