Xu Qifang, Canutescu Adrian A, Wang Guoli, Shapovalov Maxim, Obradovic Zoran, Dunbrack Roland L
Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA.
J Mol Biol. 2008 Aug 29;381(2):487-507. doi: 10.1016/j.jmb.2008.06.002. Epub 2008 Jun 7.
Many proteins function as homo-oligomers and are regulated via their oligomeric state. For some proteins, the stoichiometry of homo-oligomeric states under various conditions has been studied using gel filtration or analytical ultracentrifugation experiments. The interfaces involved in these assemblies may be identified using cross-linking and mass spectrometry, solution-state NMR, and other experiments. However, for most proteins, the actual interfaces that are involved in oligomerization are inferred from X-ray crystallographic structures using assumptions about interface surface areas and physical properties. Examination of interfaces across different Protein Data Bank (PDB) entries in a protein family reveals several important features. First, similarities in space group, asymmetric unit size, and cell dimensions and angles (within 1%) do not guarantee that two crystals are actually the same crystal form, containing similar relative orientations and interactions within the crystal. Conversely, two crystals in different space groups may be quite similar in terms of all the interfaces within each crystal. Second, NMR structures and an existing benchmark of PDB crystallographic entries consisting of 126 dimers as well as larger structures and 132 monomers were used to determine whether the existence or lack of common interfaces across multiple crystal forms can be used to predict whether a protein is an oligomer or not. Monomeric proteins tend to have common interfaces across only a minority of crystal forms, whereas higher-order structures exhibit common interfaces across a majority of available crystal forms. The data can be used to estimate the probability that an interface is biological if two or more crystal forms are available. 
Finally, the Protein Interfaces, Surfaces, and Assemblies (PISA) database available from the European Bioinformatics Institute identifies interfaces observed in many crystal forms more consistently than either the PDB or the European Bioinformatics Institute's Protein Quaternary Server (PQS). In particular, the PDB's biological unit files omit interfaces that are highly likely to be biological for about 10% of PDB entries.
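The two observations above (that cell parameters agreeing within 1% do not by themselves establish a shared crystal form, and that interfaces recurring across a majority of crystal forms point toward an oligomer) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the input layout (a dictionary from a crystal form label to a set of interface cluster identifiers) and the 0.5 cutoff standing in for "a majority" are assumptions made here for illustration, and the upstream step that decides whether two interfaces are the same is assumed to exist and is not shown.

```python
from collections import Counter


def cells_within_tolerance(cell_a, cell_b, tol=0.01):
    """Return True if all six cell parameters (a, b, c, alpha, beta, gamma)
    agree within a relative tolerance (1% by default).

    Passing this check does NOT show that two crystals share a crystal form;
    per the abstract, the interfaces themselves still have to be compared.
    """
    return all(
        abs(x - y) <= tol * max(abs(x), abs(y))
        for x, y in zip(cell_a, cell_b)
    )


def interface_form_fractions(forms):
    """Fraction of crystal forms in which each interface is observed.

    `forms` maps a (hypothetical) crystal form label to the set of interface
    cluster IDs found in that form.
    """
    if not forms:
        return {}
    counts = Counter()
    for interfaces in forms.values():
        counts.update(set(interfaces))  # count each interface once per form
    n_forms = len(forms)
    return {iface: counts[iface] / n_forms for iface in counts}


def majority_interfaces(forms, cutoff=0.5):
    """Interfaces seen in more than `cutoff` of the crystal forms.

    The 0.5 cutoff is an illustrative reading of "a majority", not a
    calibrated probability that an interface is biological.
    """
    fractions = interface_form_fractions(forms)
    return {iface for iface, frac in fractions.items() if frac > cutoff}


# Hypothetical family with four crystal forms and three interface clusters.
example_forms = {
    "form_1": {"A", "B"},
    "form_2": {"A"},
    "form_3": {"A", "C"},
    "form_4": {"B"},
}
example_fractions = interface_form_fractions(example_forms)
example_majority = majority_interfaces(example_forms)
```

In this made-up example, interface "A" occurs in three of the four forms and is the only one that clears the majority cutoff, while "B" (two of four) and "C" (one of four) do not. A real analysis would replace the fixed cutoff with probabilities estimated from a benchmark of known monomers and oligomers, as the abstract describes.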