Jacoboni I, Martelli P L, Fariselli P, Compiani M, Casadio R
Laboratory of Biocomputing, Centro Interdipartimentale per le Ricerche Biotecnologiche (CIRB), Bologna, Italy.
Proteins. 2000 Dec 1;41(4):535-44. doi: 10.1002/1097-0134(20001201)41:4<535::aid-prot100>3.0.co;2-c.
The most stringent test of a protein secondary structure prediction method is whether it can correctly discriminate identical short sequences that adopt different conformations in different proteins of known atomic-resolution structure. In this study, we show that the prediction of such segments in unrelated proteins reaches an average per-residue accuracy of about 72 to 75% (depending on the alignment method used to generate the input sequence profile) only when third-generation methods are used. A comparison of methods based on segment statistics (2nd generation methods) and/or also including evolutionary information (3rd generation methods) indicates that the discrimination of the different conformations of identical segments depends on the prediction method used. Methods that perform similarly on overall secondary structure prediction achieve similar accuracy on these segments. When evolutionary information is taken into account, the number of correctly discriminated pairs increases twofold relative to single-sequence input. The results also highlight the predictive capability of neural networks for identical segments whose conformation differs between proteins.
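As an illustration of the kind of evaluation the abstract describes, the following is a minimal sketch of how per-residue accuracy and a count of discriminated pairs might be computed for identical segments observed in two different proteins. It is not the authors' protocol: the function names, the three-state H/E/C strings, the toy data, and in particular the 0.7 threshold used to call a pair "discriminated" are all assumptions made for this example, since the abstract does not define that criterion.

```python
from typing import List, Tuple

# One segment occurrence: (predicted states, observed states), e.g. "HHEEC".
Segment = Tuple[str, str]
# One pair: the same sequence segment as it occurs in two different proteins.
SegmentPair = Tuple[Segment, Segment]


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("segment must contain at least one residue")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def mean_accuracy(pairs: List[SegmentPair]) -> float:
    """Average per-residue accuracy over every segment occurrence in all pairs."""
    scores = [per_residue_accuracy(pred, obs) for pair in pairs for pred, obs in pair]
    return sum(scores) / len(scores)


def count_discriminated(pairs: List[SegmentPair], threshold: float = 0.7) -> int:
    """Count pairs where BOTH occurrences reach the accuracy threshold.

    The threshold is a hypothetical criterion chosen for this sketch only.
    """
    return sum(
        all(per_residue_accuracy(pred, obs) >= threshold for pred, obs in pair)
        for pair in pairs
    )


# Toy data (invented for illustration, not taken from the study).
TOY_PAIRS: List[SegmentPair] = [
    (("HHHHH", "HHHHH"), ("EEEEC", "EEEEE")),  # both conformations largely recovered
    (("HHHCC", "EEEEE"), ("HHHHH", "HHHHH")),  # strand occurrence mispredicted
]

print("mean per-residue accuracy:", mean_accuracy(TOY_PAIRS))
print("discriminated pairs:", count_discriminated(TOY_PAIRS))
```

Requiring both occurrences of a pair to be predicted well reflects the idea that a method must recover two different conformations for the same sequence; a stricter or looser criterion would change the pair count but not the per-residue accuracy.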