Carugo Oliviero
Department of General Chemistry, University of Pavia, Viale Taramelli 12, I-27100 Pavia, Italy.
In Silico Biol. 2003;3(4):417-28.
A method is presented to predict which polypeptide segments of a globular protein are most likely to be exposed to the solvent. The amino acid sequence of the protein is the only information the algorithm needs. The method uses a consensus hydrophobicity scale, derived from 28 known scales, and compares the average hydrophobicity of a polypeptide fragment with the average hydrophobicity expected for a segment containing the same number of residues. The expected values are pre-computed from a non-redundant set of single-chain protein structural domains. The comparison between the two averages yields a t value, which gives the statistical significance of the prediction. A jack-knife validation analysis indicates that the segment predicted to be the most solvent exposed is indeed solvent exposed and ranks among the most exposed fragments of the protein. The source code of a stand-alone program, written in C, that predicts the most likely solvent-exposed segment of a globular protein is available from the author.
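The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the author's C program, and several parts are assumptions made only for illustration: the Kyte-Doolittle hydropathy scale stands in for the consensus scale (which is not given here), `ref_mean` and `ref_sd` are hypothetical placeholders for the pre-computed reference statistics of segments of the same length, and the score is a simple standardized difference rather than the exact t statistic of the paper. The function names `window_means` and `most_exposed_segment` are likewise hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in for the
# consensus hydrophobicity scale (assumption: the real scale differs).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(sequence, length, scale=KYTE_DOOLITTLE):
    """Average hydrophobicity of every contiguous segment of `length` residues."""
    values = [scale[residue] for residue in sequence.upper()]
    return [
        sum(values[start:start + length]) / length
        for start in range(len(values) - length + 1)
    ]


def most_exposed_segment(sequence, length, ref_mean, ref_sd, scale=KYTE_DOOLITTLE):
    """Return (start, score) for the segment most hydrophilic relative to expectation.

    `ref_mean` and `ref_sd` are hypothetical reference statistics for segments
    of this length. The score (ref_mean - segment_mean) / ref_sd is an assumed,
    simplified stand-in for the t value: larger means less hydrophobic than
    expected, which this sketch reads as more likely to be solvent exposed.
    Returns None if the sequence is shorter than `length`.
    """
    best = None
    for start, mean in enumerate(window_means(sequence, length, scale)):
        score = (ref_mean - mean) / ref_sd
        if best is None or score > best[1]:
            best = (start, score)
    return best


if __name__ == "__main__":
    # Toy sequence; the reference statistics below are made-up placeholders.
    start, score = most_exposed_segment("LLLLLKKKKKLLLLL", 5, ref_mean=0.0, ref_sd=1.0)
    print(f"segment starts at index {start}, score {score:.2f}")
```

In a real application the reference mean and standard deviation would be tabulated per segment length from a set of known structures, as the abstract describes, instead of being passed in by hand.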