Bagley S C, Wei L, Cheng C, Altman R B
Section on Medical Informatics, Stanford University School of Medicine, CA 94305-5479, USA.
Proc Int Conf Intell Syst Mol Biol. 1995;3:12-20.
A protein site is a region of a three-dimensional protein structure with a distinguishing functional or structural role. Certain sites (for example, catalytic sites, calcium-binding sites, and some types of turns) recur in different protein structures while maintaining critical shared features. To facilitate the analysis of such protein sites, we have developed a computer system for analyzing the spatial distributions of biochemical properties around a site. The system takes a set of similar sites and a set of control nonsites, and finds differences between them. Specifically, it compares the distributions of the properties surrounding the sites with those surrounding the nonsites, and reports statistically significant differences. In this paper, we use our method to analyze the features of the active site of the serine proteases. We compare the use of radial distributions (shells) with 3-D grids (blocks) in the analysis of the active site. We demonstrate three different strategies for focusing attention on significant findings, based on properties of interest, on spatial volumes of interest, and on the level of statistical significance. Finally, we show that the program automatically identifies conserved sequence, secondary-structure, and biophysical features of the serine protease active site, using noncatalytic histidine residues as a control environment.
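The site-versus-nonsite comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' program: the abstract does not name the statistical test, so a two-sided permutation test on group means is assumed here, and the function names (shell_counts, permutation_p_value, significant_shells), the shell width, the number of shells, and the significance threshold are all hypothetical choices made for illustration. Each environment is a (center, atoms) pair, where atoms are the 3-D coordinates of atoms carrying one property of interest.

```python
import math
import random


def shell_counts(center, atoms, shell_width=1.0, n_shells=5):
    """Count atoms in each concentric shell around `center`.

    Shell i covers distances in [i * shell_width, (i + 1) * shell_width).
    Atoms beyond the outermost shell are ignored.
    """
    counts = [0] * n_shells
    for atom in atoms:
        idx = int(math.dist(center, atom) // shell_width)
        if idx < n_shells:
            counts[idx] += 1
    return counts


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(site_values, nonsite_values, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Assumed stand-in for the significance test; the abstract does not
    specify which test the original system uses.
    """
    rng = random.Random(seed)
    n_site = len(site_values)
    pooled = list(site_values) + list(nonsite_values)
    observed = abs(_mean(site_values) - _mean(nonsite_values))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:n_site]) - _mean(pooled[n_site:]))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def significant_shells(site_envs, nonsite_envs, alpha=0.01,
                       shell_width=1.0, n_shells=5):
    """Return (shell index, p-value) for shells that differ at level alpha."""
    site_rows = [shell_counts(c, a, shell_width, n_shells) for c, a in site_envs]
    nonsite_rows = [shell_counts(c, a, shell_width, n_shells) for c, a in nonsite_envs]
    findings = []
    for i in range(n_shells):
        p = permutation_p_value([row[i] for row in site_rows],
                                [row[i] for row in nonsite_rows])
        if p < alpha:
            findings.append((i, p))
    return findings
```

A 3-D grid (block) variant would replace shell_counts with a function that bins atoms by their x, y, and z offsets from the center; the comparison step would stay the same. Because every shell and every property is tested separately, a real analysis would also need to account for multiple comparisons when choosing alpha.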