Benner S A, Badcoe I, Cohen M A, Gerloff D L
Laboratory for Organic Chemistry E.T.H., Zurich, Switzerland.
J Mol Biol. 1994 Jan 21;235(3):926-58. doi: 10.1006/jmbi.1994.1049.
Heuristics have been developed for analyzing patterns of conservation and variation within a set of aligned homologous protein sequences for the purpose of assigning amino acids whose side-chains lie on the surface and inside the folded structure of a protein. These were used in several recent bona fide predictions of the secondary structure of proteins from sequence data, made and published before crystallographic information became available. Heuristics based on concurrent hydrophilic variation identify positions that lie on the surface. Heuristics based on concurrent hydrophobic conservation and variation identify positions lying in the interior. These heuristics are described here in detail and their performance evaluated when applied to seven protein families with known three-dimensional structures. The performance of individual heuristics is shown to depend on the nature of the multiple alignment within the protein family, and a strategy is presented for obtaining surface and interior assignments useful for predicting secondary structure.
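The two families of heuristics summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' published procedure: the abstract gives no residue classes, thresholds, or scoring rules, so everything specific below is an assumption made for illustration. In particular, the residue sets `HYDROPHILIC` and `HYDROPHOBIC`, the function name `assign_positions`, and the rule "every residue in the column belongs to one class" are hypothetical simplifications. A column that varies among hydrophilic residues is labelled surface (`s`), a column that is conserved or varies only among hydrophobic residues is labelled interior (`i`), and all other columns are left unassigned (`.`).

```python
# Illustrative residue classes (assumed, not taken from the paper).
HYDROPHILIC = frozenset("KREDNQ")
HYDROPHOBIC = frozenset("FAMILYVW")


def assign_positions(alignment):
    """Label each alignment column as surface 's', interior 'i', or unassigned '.'.

    `alignment` is a list of equal-length aligned sequences; '-' marks a gap.
    """
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    labels = []
    for column in zip(*alignment):
        # Distinct residues observed at this position, ignoring gaps.
        residues = {r.upper() for r in column if r != "-"}
        if len(residues) >= 2 and residues <= HYDROPHILIC:
            # Variation confined to hydrophilic residues: candidate surface position.
            labels.append("s")
        elif residues and residues <= HYDROPHOBIC:
            # Hydrophobic residues only, conserved or varying: candidate interior position.
            labels.append("i")
        else:
            labels.append(".")
    return "".join(labels)


# Toy alignment invented for this example; it is not data from the paper.
toy_alignment = ["KVLDG", "RILEG", "EVLKG", "DLLNG"]
print(assign_positions(toy_alignment))  # prints "siis."
```

As the abstract notes, the performance of such rules depends on the nature of the multiple alignment, so a realistic implementation would presumably tune or select heuristics according to the divergence of the family rather than apply one fixed rule to every column.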