Chandonia J M, Karplus M
Biophysics Program, Harvard University, Cambridge, Massachusetts 02138, USA.
Protein Sci. 1995 Feb;4(2):275-85. doi: 10.1002/pro.5560040214.
A pair of neural network-based algorithms is presented for predicting the tertiary structural class and the secondary structure of proteins. Each algorithm gains accuracy from information provided by the other. Structural class prediction for proteins nonhomologous to any in the training set improves significantly, from 62.3% to 73.9%, and secondary structure prediction accuracy improves slightly, from 62.26% to 62.64%. Several aspects of neural network optimization and testing are examined, including network overtraining and an output filter based on a rolling average. Secondary structure prediction results vary greatly depending on the particular proteins chosen for the training and test sets; consequently, an appropriate measure of accuracy is one obtained by the less biased approach of "jackknife" cross-validation (testing each protein in the database individually).
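The rolling-average output filter mentioned above can be illustrated with a minimal sketch. The abstract does not give the filter's details, so everything here is an assumption: the three-state labelling (`H`, `E`, `C`), the window width of 3, the shrinking window at the chain ends, and the function names `rolling_average_filter` and `predict_states` are all hypothetical.

```python
from typing import List, Sequence

# Assumed three-state labelling: helix, strand, coil.
STATES = "HEC"


def rolling_average_filter(
    outputs: Sequence[Sequence[float]], window: int = 3
) -> List[List[float]]:
    """Average each output channel over a centred window of residues.

    `outputs[i][k]` is the raw network output for state k at residue i.
    The window shrinks at the chain ends (an assumption, not from the paper).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(outputs)
    smoothed = []
    for i in range(n):
        span = outputs[max(0, i - half): min(n, i + half + 1)]
        width = len(span[0])
        smoothed.append(
            [sum(row[k] for row in span) / len(span) for k in range(width)]
        )
    return smoothed


def predict_states(outputs: Sequence[Sequence[float]], window: int = 3) -> str:
    """Pick the highest-scoring state per residue after smoothing."""
    smoothed = rolling_average_filter(outputs, window)
    return "".join(
        STATES[max(range(len(row)), key=row.__getitem__)] for row in smoothed
    )
```

With `window=1` the filter is a no-op, which makes it easy to compare filtered and unfiltered predictions on the same network outputs; an isolated one-residue state flanked by confident neighbours tends to be smoothed away.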
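Jackknife cross-validation, as described in the abstract, holds out each protein in turn, trains on the rest, and tests on the held-out protein. The sketch below shows only that loop. The `train` callback, the `Protein` tuple layout, the pooling of accuracy over residues, and the majority-class stand-in predictor are hypothetical choices for illustration, not the authors' networks or scoring procedure.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Assumed layout: (amino-acid sequence, per-residue secondary-structure labels).
Protein = Tuple[str, str]
Predictor = Callable[[str], str]


def jackknife_accuracy(
    proteins: Sequence[Protein],
    train: Callable[[List[Protein]], Predictor],
) -> float:
    """Leave-one-protein-out accuracy, pooled over all residues."""
    if not proteins:
        raise ValueError("need at least one protein")
    correct = total = 0
    for i, (sequence, labels) in enumerate(proteins):
        # Train on every protein except the one being tested.
        training_set = list(proteins[:i]) + list(proteins[i + 1:])
        predict = train(training_set)
        predicted = predict(sequence)
        correct += sum(p == t for p, t in zip(predicted, labels))
        total += len(labels)
    return correct / total


def train_majority(training_set: List[Protein]) -> Predictor:
    """Toy stand-in for a trained network: always predict the most common label."""
    counts = Counter(label for _, labels in training_set for label in labels)
    majority = counts.most_common(1)[0][0]
    return lambda sequence: majority * len(sequence)
```

Calling `jackknife_accuracy(proteins, train_majority)` retrains once per protein, so the cost grows linearly with database size; the benefit is that the reported accuracy does not depend on one arbitrary split into training and test sets.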