Rost B, Sander C
European Molecular Biology Laboratory, Heidelberg, Germany.
J Mol Biol. 1993 Jul 20;232(2):584-99. doi: 10.1006/jmbi.1993.1413.
We have trained a two-layered feed-forward neural network on a non-redundant database of 130 protein chains to predict the secondary structure of water-soluble proteins. A key new aspect is the use of evolutionary information in the form of multiple sequence alignments, which are used as input in place of single sequences. Including protein family information in this form increases the prediction accuracy by six to eight percentage points. A combination of three levels of networks results in an overall three-state accuracy of 70.8% for globular proteins (sustained performance). If four membrane protein chains are included in the evaluation, the overall accuracy drops to 70.2%. The prediction is well balanced between alpha-helix, beta-strand and loop: 65% of the observed strand residues are predicted correctly. The accuracy in predicting the content of the three secondary structure types is comparable to that of circular dichroism spectroscopy. The performance accuracy is verified by a sevenfold cross-validation test and by an additional test on 26 recently solved proteins. Of particular practical importance is the definition of a position-specific reliability index. For half of the residues predicted with a high level of reliability, the overall accuracy increases to better than 82%. A further strength of the method is the more realistic prediction of segment length. The protein family prediction method is available for testing by academic researchers via an electronic mail server.
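The two accuracy figures quoted above (overall three-state accuracy, and the percentage of observed residues of one type predicted correctly) can be illustrated with a minimal sketch. The state labels H/E/L, the function names `q3` and `per_state_accuracy`, and the toy strings are assumptions made for illustration only; they are not taken from the paper and the data are not real predictions.

```python
from collections import Counter

# Assumed one-letter labels: H = alpha-helix, E = beta-strand, L = loop.
STATES = ("H", "E", "L")


def q3(observed: str, predicted: str) -> float:
    """Overall three-state accuracy: percentage of residues whose
    predicted state equals the observed state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def per_state_accuracy(observed: str, predicted: str) -> dict:
    """For each state, the percentage of residues observed in that state
    that were also predicted in that state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    totals = Counter(observed)
    hits = Counter(o for o, p in zip(observed, predicted) if o == p)
    # States absent from the observed string are skipped to avoid 0/0.
    return {s: 100.0 * hits[s] / totals[s] for s in STATES if totals[s]}


# Toy example (hypothetical data).
observed = "HHHHLLEEEELL"
predicted = "HHHLLLEEELLL"
print(round(q3(observed, predicted), 1))        # 10 of 12 residues match
print(per_state_accuracy(observed, predicted))  # per-state percentages
```

A prediction is "well balanced" in the sense used above when the per-state percentages are of similar size, rather than a high overall figure being carried by one dominant state.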