Rost B, Sander C, Schneider R
Comput Appl Biosci. 1994 Feb;10(1):53-60. doi: 10.1093/bioinformatics/10.1.53.
By the middle of 1993, more than 30,000 protein sequences had been listed. For 1000 of these, the three-dimensional (tertiary) structure has been experimentally solved. Another 7000 can be modelled by homology. For the remaining 21,000 sequences, secondary structure prediction provides a rough estimate of structural features. Predictions in three states range between 35% (random) and 88% (homology modelling) overall accuracy. Using information about evolutionary conservation as contained in multiple sequence alignments, the secondary structure of 4700 protein sequences was predicted by the automatic e-mail server PHD. For proteins with at least one known homologue, the method has an expected overall three-state accuracy of 71.4% (evaluated on 126 unique protein chains).
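The abstract quotes overall three-state accuracy without defining it. A minimal sketch follows, assuming the usual per-residue measure: the percentage of residues whose predicted state matches the observed state. The state letters `H`, `E`, `L` (helix, strand, loop), the function name `three_state_accuracy`, and the example strings are illustrative assumptions, not taken from the paper or from the PHD server.

```python
# Assumed state alphabet: H = helix, E = strand, L = loop (illustrative).
STATES = frozenset("HEL")


def three_state_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    if not (set(predicted) <= STATES and set(observed) <= STATES):
        raise ValueError("states must be one of H, E, L")
    # Count positions where prediction and observation agree.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Made-up ten-residue example with one mismatch at the fourth position.
    observed = "HHHHEEELLL"
    predicted = "HHHEEEELLL"
    print(three_state_accuracy(predicted, observed))
```

Under this definition, a figure such as 71.4% would mean that roughly seven of every ten residues are assigned the correct state. How per-chain values are combined across the 126 evaluation chains is not stated in the abstract, so the sketch scores a single chain only.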