Liang J, Edelsbrunner H, Woodward C
Department of Biochemistry, University of Minnesota, St. Paul 55108, USA.
Protein Sci. 1998 Sep;7(9):1884-97. doi: 10.1002/pro.5560070905.
Identification and size characterization of surface pockets and occluded cavities are initial steps in protein structure-based ligand design. A new program, CAST, for automatically locating and measuring protein pockets and cavities, is based on precise computational geometry methods, including alpha shape and discrete flow theory. CAST identifies and measures pockets and pocket mouth openings, as well as cavities. The program specifies the atoms lining pockets, pocket openings, and buried cavities; the volume and area of pockets and cavities; and the area and circumference of mouth openings. CAST analysis of over 100 proteins has been carried out; proteins examined include a set of 51 monomeric enzyme-ligand structures, several elastase-inhibitor complexes, the FK506 binding protein, 30 HIV-1 protease-inhibitor complexes, and a number of small and large protein inhibitors. Medium-sized globular proteins typically have 10-20 pockets/cavities. Most often, binding sites are pockets with 1-2 mouth openings; much less frequently they are cavities. Ligand binding pockets vary widely in size, most within the range 10²-10³ Å³. Statistical analysis reveals that the number of pockets and cavities is correlated with protein size, but there is no correlation between the size of the protein and the size of binding sites. Most frequently, the largest pocket/cavity is the active site, but there are a number of instructive exceptions. Ligand volume and binding site volume are somewhat correlated when binding site volume is ≤700 Å³, but the ligand seldom occupies the entire site. Auxiliary pockets near the active site have been suggested as additional binding surface for designed ligands (Mattos C et al., 1994, Nat Struct Biol 1:55-58). Analysis of elastase-inhibitor complexes suggests that CAST can identify ancillary pockets suitable for recruitment in ligand design strategies. Analysis of the FK506 binding protein, and of compounds developed in SAR by NMR (Shuker SB et al., 1996, Science 274:1531-1534), indicates that CAST pocket computation may provide a priori identification of target proteins for linked-fragment design. CAST analysis of 30 HIV-1 protease-inhibitor complexes shows that the flexible active site pocket can vary over a range of 853-1,566 Å³, and that there are two pockets near or adjoining the active site that may be recruited for ligand design.
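The volume comparison reported above (ligand volume against binding site volume, restricted to sites of at most 700 Å³) can be sketched as follows. This is a minimal illustration only: the abstract does not say which correlation statistic was used, so the Pearson coefficient here is an assumption, and the function names and the `cutoff` parameter are hypothetical. CAST itself is not called; the volumes are assumed to be supplied by the caller as paired lists in Å³.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlation_below_cutoff(site_volumes, ligand_volumes, cutoff=700.0):
    """Correlate ligand and site volumes, keeping only sites <= cutoff.

    The default cutoff of 700 cubic angstroms mirrors the threshold quoted
    in the abstract; the choice of Pearson's r is an assumption.
    Returns (r, number_of_pairs_kept).
    """
    kept = [(s, l) for s, l in zip(site_volumes, ligand_volumes) if s <= cutoff]
    sites = [s for s, _ in kept]
    ligands = [l for _, l in kept]
    return pearson(sites, ligands), len(kept)
```

Called with paired per-complex volumes, `correlation_below_cutoff` returns the coefficient together with the number of complexes that passed the size filter, so the strength of the relationship can be read alongside the sample size it rests on.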