Gonnet G H, Cohen M A, Benner S A
Institute for Scientific Computation, Swiss Federal Institute of Technology, Zurich, Switzerland.
Science. 1992 Jun 5;256(5062):1443-5. doi: 10.1126/science.1604319.
The entire protein sequence database has been exhaustively matched. Definitive mutation matrices and models for scoring gaps were obtained from the matching and used to organize the sequence database as sets of evolutionarily connected components. The methods developed are general and can be used to manage sequence data generated by major genome sequencing projects. The alignments made possible by the exhaustive matching are the starting point for successful de novo prediction of the folded structures of proteins, for reconstructing sequences of ancient proteins and metabolisms in ancient organisms, and for obtaining new perspectives in structural biochemistry.