Albou Laurent-Philippe, Schwarz Benjamin, Poch Olivier, Wurtz Jean Marie, Moras Dino
Department of Biology and Structural Genomics, IGBMC, CNRS, INSERM, ULP, Illkirch, France.
Proteins. 2009 Jul;76(1):1-12. doi: 10.1002/prot.22301.
The alpha shape of a molecule is a geometrical representation that provides a unique surface decomposition and a means to filter atomic contacts. We used it to revisit and unify the definition and computation of surface residues, contiguous patches, and curvature. These descriptors are evaluated and compared with former approaches on 85 proteins for which both bound and unbound forms are available. Based on the local density of interactions, the detection of surface residues shows a sensitivity of 98% while preserving a well-formed protein core. A novel concept of surface patch is defined by traveling along the surface from a central residue or atom. By construction, all surface patches are contiguous, which avoids the common problems of wrongly selected and missed neighbors. For protein-binding site prediction, this new definition improves the signal-to-noise ratio 2.6-fold compared with a widely used approach. With most common approaches, the computation of surface curvature can be locally biased by the presence of subsurface cavities and local variations of atomic density. A novel notion of surface curvature is specifically developed to avoid such bias and can be parametrized to emphasize either local or global features. It defines a molecular landscape composed on average of 38% knobs and 62% clefts, in which interacting residues (IR) are 30% more frequent in knobs than in clefts. A statistical analysis shows that residues in knobs are more charged, less hydrophobic, and less aromatic than residues in clefts. IR in knobs are, however, much more hydrophobic and aromatic and less charged than noninteracting residues (non-IR) in knobs. Furthermore, IR are shown to be more accessible than non-IR in both clefts and knobs. The use of the alpha shape as a unifying framework allows for formal definitions and for the fast and robust computations desirable in large-scale projects. This speed is not achieved at the expense of quality, as shown by the improvements over former approaches. In addition, our approach is general enough to be applied to nucleic acids and any other biomolecule.
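The idea of defining a patch by traveling along the surface from a central residue can be illustrated with a breadth-first walk over a surface adjacency graph. This is only a minimal sketch and not the authors' implementation: it assumes the adjacency between surface residues has already been derived (for example from the edges of the alpha shape), and the function name `grow_patch`, the `max_steps` parameter, and the toy graph are all hypothetical.

```python
from collections import deque


def grow_patch(adjacency, center, max_steps):
    """Return {node: steps} for nodes reachable from `center`
    in at most `max_steps` hops along the surface graph.

    `adjacency` is an assumed input: a dict mapping each surface
    residue to the surface residues it touches.
    """
    depth = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depth[node] == max_steps:
            continue  # do not expand beyond the patch boundary
        for neighbor in adjacency.get(node, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth


# Hypothetical toy surface: a chain A-B-C-D-E that wraps around a cleft,
# so A and E could be close in space but are far apart along the surface.
surface = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

patch = grow_patch(surface, "A", 2)
print(sorted(patch))
```

Because every member is reached through a chain of surface neighbors, the resulting patch is contiguous by construction. A selection based on a Euclidean radius around the central residue could instead pick up a residue such as E across the cleft while skipping the residues that connect it to the center.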