Taylor W R, Jones D T, Green N M
Laboratory of Mathematical Biology, National Institute for Medical Research, London, U.K.
Proteins. 1994 Mar;18(3):281-94. doi: 10.1002/prot.340180309.
Integral membrane proteins (of the alpha-helical class) are of central importance in a wide variety of vital cellular functions. Despite considerable effort on methods to predict the location of the helices, little attention has been directed toward developing an automatic method to pack the helices together. In principle, the prediction of membrane proteins should be easier than the prediction of globular proteins: there is only one type of secondary structure and all helices pack with a common alignment across the membrane. This allows all possible structures to be represented on a simple lattice and exhaustively enumerated. Prediction success lies not in generating many possible folds but in recognizing which one corresponds to the native fold. Our evaluation of each fold is based on how well the exposed surface predicted from a multiple sequence alignment fits its allocated position. Just as exposure to solvent in globular proteins can be predicted from sequence variation, so exposure to lipid can be recognized by variable-hydrophobic (variphobic) positions. Application to both bacteriorhodopsin and the eukaryotic rhodopsin/opsin families revealed that the angular size of the lipid-exposed faces must be predicted accurately to allow selection of the correct fold. With the inherent uncertainties in helix prediction and parameter choice, this accuracy could not be guaranteed, but the correct fold was typically found in the top six candidates. This is the first completely automatic method that can proceed from a scan of the protein sequence databanks to a predicted three-dimensional structure with no intervention required from the investigator. Within the limited domain of the seven-helix bundle proteins, there is a good chance of selecting the correct structure. However, the limited number of sequences available with a corresponding known structure makes further characterization of the method difficult.
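The enumerate-then-rank idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and nearly every detail is an assumption made for illustration: the hexagonal lattice, the rule that consecutive helices occupy adjacent sites, the per-helix exposure measure (fraction of empty neighbouring sites, standing in for the angular size of the lipid-exposed face), and the absolute-difference score. The function names and the example exposure values are hypothetical.

```python
# Illustrative sketch only: the lattice, adjacency rule and score are assumptions,
# not the method described in the abstract.

# Six neighbour offsets on a hexagonal lattice (axial coordinates).
HEX_NEIGHBOURS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def enumerate_folds(n_helices):
    """Yield every self-avoiding placement of n_helices on the lattice.

    Consecutive helices sit on adjacent sites (assumed). The first two helices
    are pinned to remove translations and rotations; mirror images are kept.
    """
    if n_helices < 2:
        raise ValueError("need at least two helices")

    def extend(path):
        if len(path) == n_helices:
            yield tuple(path)
            return
        q, r = path[-1]
        for dq, dr in HEX_NEIGHBOURS:
            site = (q + dq, r + dr)
            if site not in path:
                path.append(site)
                yield from extend(path)
                path.pop()

    yield from extend([(0, 0), (1, 0)])


def lattice_exposure(fold):
    """Fraction of each helix's six neighbouring sites left empty (lipid-facing)."""
    occupied = set(fold)
    exposure = []
    for q, r in fold:
        free = sum((q + dq, r + dr) not in occupied for dq, dr in HEX_NEIGHBOURS)
        exposure.append(free / 6)
    return exposure


def score_fold(fold, predicted_exposure):
    """Sum of absolute mismatches between predicted and lattice exposure (lower is better)."""
    return sum(abs(p - e) for p, e in zip(predicted_exposure, lattice_exposure(fold)))


def rank_folds(predicted_exposure, top=6):
    """Return the `top` best-scoring folds for the given per-helix exposure predictions."""
    folds = enumerate_folds(len(predicted_exposure))
    return sorted(folds, key=lambda f: score_fold(f, predicted_exposure))[:top]


if __name__ == "__main__":
    # Made-up exposure fractions for a seven-helix bundle, for demonstration only.
    predicted = [0.67, 0.5, 0.5, 0.67, 0.5, 0.33, 0.17]
    for fold in rank_folds(predicted):
        print(round(score_fold(fold, predicted), 3), fold)
```

In a real application the predicted exposure would come from the variphobic positions of a multiple sequence alignment. Returning a short ranked list rather than a single fold mirrors the observation that the correct fold was typically among the top six candidates.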