Rost B, Casadio R, Fariselli P, Sander C
Protein Design Group, EMBL Heidelberg, Germany.
Protein Sci. 1995 Mar;4(3):521-33. doi: 10.1002/pro.5560040318.
We describe a neural network system that predicts the locations of transmembrane helices in integral membrane proteins. By using evolutionary information as input to the network system, the method significantly improved on a previously published neural network prediction method that had been based on single sequence information. The input data were derived from multiple alignments for each position in a window of 13 adjacent residues: amino acid frequency, conservation weights, number of insertions and deletions, and position of the window with respect to the ends of the protein chain. Additional input was the amino acid composition and length of the whole protein. A rigorous cross-validation test on 69 proteins with experimentally determined locations of transmembrane segments yielded an overall two-state per-residue accuracy of 95%. About 94% of all segments were predicted correctly. When applied to known globular proteins as a negative control, the network system incorrectly predicted fewer than 5% of globular proteins as having transmembrane helices. The method was applied to all 269 open reading frames from the complete yeast VIII chromosome. For 59 of these, at least two transmembrane helices were predicted. Thus, the prediction is that about one-fourth of all proteins from yeast VIII contain one transmembrane helix, and some 20%, more than one.
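The alignment-derived input described above can be illustrated with a minimal sketch. It covers only the per-position amino acid frequencies in a 13-residue window; the conservation weights, insertion/deletion counts, window position, and whole-protein composition and length are omitted. The function names, the toy alignment, and the zero-padding at the chain ends are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 13  # window width stated in the abstract


def column_frequencies(alignment):
    """Return one 20-element frequency list per alignment column.

    Gap characters and any non-standard symbols are ignored (an assumption).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append([counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS])
    return profile


def window_features(profile, center, window=WINDOW):
    """Concatenate the frequency vectors of `window` columns centred on `center`.

    Positions beyond the chain ends are zero-padded (an assumption; the
    abstract does not say how the ends are handled).
    """
    half = window // 2
    features = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(profile):
            features.extend(profile[pos])
        else:
            features.extend([0.0] * len(AMINO_ACIDS))
    return features


# Hypothetical toy alignment: three aligned sequences, one gap in column 3.
alignment = ["MKLLVA", "MRLIVA", "MKL-VG"]
profile = column_frequencies(alignment)
features = window_features(profile, center=0)
```

Each window yields 13 × 20 = 260 frequency values under these assumptions; a classifier would receive one such vector per residue, together with the additional inputs listed in the abstract.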