Ye Yuzhen, Li Zhanwen, Godzik Adam
Bioinformatics and Systems Biology Program, The Burnham Institute, La Jolla, CA 92037, USA.
Pac Symp Biocomput. 2006:439-50.
Three-dimensional structures of proteins, experimental or predicted, show us how these molecular machines actually work. Combined with information on disease-related mutations, they can also show how these proteins malfunction in disease. Such understanding, currently lacking for most human diseases, is an important first step toward designing drugs or therapies for specific diseases. Here we used homology modeling to model human disease-related proteins, studied the structural characteristics of disease-related mutations, and compared them with non-synonymous SNPs. We modeled 1484 domains from 874 proteins; together with experimentally determined structures of 369 domains, they provide structural coverage of 48% of all residues in 1237 human disease proteins. We found that disease-related mutations show a statistically significant preference for forming clusters on protein surfaces. In contrast, non-synonymous SNPs appear to be randomly distributed on the surface. We interpret these results as an indication that disease mutations affect protein-protein interaction interfaces. This interpretation is supported by the analysis of 8 experimentally determined complexes between disease proteins, in which disease-related mutations are clearly located in the binding interfaces, while SNPs are not. The non-uniform distribution of disease mutations suggests that this feature can be used as guidance in modeling and evaluating human disease proteins and their complexes. We set up a resource for Disease Protein Models (DPM, at http://ffas.burnham.org/DPM), which can be used to study the relation between disease and mutation/polymorphism sites in the context of protein 3D structures and complexes.
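The abstract does not say which statistic was used to establish surface clustering. As an illustration only, the sketch below shows one common way to pose the question: compare the mean pairwise distance among mutated surface residues with the same quantity for random sets of surface residues of equal size (a permutation test). The function names, the toy grid of "surface" coordinates, and the choice of statistic are all assumptions made for this example, not the authors' method.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(coords):
    """Average Euclidean distance over all pairs of 3D points."""
    pairs = list(combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def clustering_p_value(surface_coords, mutated_idx, n_perm=2000, seed=0):
    """Permutation test for spatial clustering (hypothetical helper).

    surface_coords: list of (x, y, z) tuples, one per surface residue.
    mutated_idx:    indices into surface_coords of the mutated residues.
    Returns (observed statistic, one-sided empirical p-value), where a
    small p-value means the mutated residues sit closer together than
    random surface residue sets of the same size.
    """
    rng = random.Random(seed)
    k = len(mutated_idx)
    observed = mean_pairwise_distance([surface_coords[i] for i in mutated_idx])
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(len(surface_coords)), k)
        stat = mean_pairwise_distance([surface_coords[i] for i in sample])
        if stat <= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy "surface": a flat 10 x 10 grid of residues spaced 5 angstroms apart.
    surface = [(5.0 * x, 5.0 * y, 0.0) for x in range(10) for y in range(10)]
    clustered = [0, 1, 10, 11]   # four neighbouring residues (disease-like)
    scattered = [0, 9, 90, 99]   # four far-apart residues (SNP-like)
    print("clustered:", clustering_p_value(surface, clustered))
    print("scattered:", clustering_p_value(surface, scattered))
```

In practice the coordinates would come from a model or experimental structure, and surface residues would first be selected by solvent accessibility; both steps are left out here to keep the sketch self-contained.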