Kunin V, Chan B, Sitbon E, Lithwick G, Pietrokovski S
Department of Molecular Genetics, The Weizmann Institute of Science, Rehovot, 76100, Israel.
J Mol Biol. 2001 Mar 30;307(3):939-49. doi: 10.1006/jmbi.2001.4466.
A new method was developed to analyze the similarity between multiply aligned protein motifs (blocks). It identifies sets of consistently aligned blocks, which are found to be protein regions of similar function and structure that appear in different contexts. For example, the Rossmann fold ligand-binding region is found to be similar to TIM-barrel and methylase regions, various protein families are predicted to have a TIM-barrel fold, and the structural relation between the ClpP protease and crotonase folds is identified from their sequences. Besides identifying local structural features, sequence similarity across short regions (fewer than 20 amino acid residues) also predicts structural similarity of whole domains (folds) a few hundred amino acid residues long. Most of these relations could not be identified by other advanced sequence-to-sequence or sequence-to-multiple-alignment comparisons. We describe the method (termed CYRCA), present examples of our findings, and discuss their implications.
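The abstract does not say how "consistently aligned" is tested. As a purely illustrative sketch, and not the published CYRCA procedure, one possible reading is that each pairwise block alignment is an integer offset, and that three blocks are consistent when the offsets A to B and B to C add up to the offset A to C. The function name `consistent_triples`, the offset representation, and the example values are all assumptions made for this illustration.

```python
from itertools import combinations


def consistent_triples(offsets):
    """Return block triples whose pairwise alignment offsets agree.

    offsets: dict mapping (a, b) -> integer shift that aligns block a to
    block b (a hypothetical representation, assumed for illustration).
    """
    def shift(a, b):
        # Look the pair up in either direction; reversing a pair negates the shift.
        if (a, b) in offsets:
            return offsets[(a, b)]
        if (b, a) in offsets:
            return -offsets[(b, a)]
        return None

    blocks = sorted({name for pair in offsets for name in pair})
    found = []
    for a, b, c in combinations(blocks, 3):
        ab, bc, ac = shift(a, b), shift(b, c), shift(a, c)
        if ab is None or bc is None or ac is None:
            continue  # not every pair in this triple was aligned
        if ab + bc == ac:
            found.append((a, b, c))  # going a -> b -> c matches a -> c directly
    return found


# Made-up offsets between four blocks.
example = {
    ("A", "B"): 2, ("B", "C"): 3, ("A", "C"): 5,
    ("C", "D"): 1, ("B", "D"): 4, ("A", "D"): 9,
}
triples = consistent_triples(example)
```

In this toy input, block D agrees with B and C but not with A, so only the triples that avoid the conflicting A to D offset are kept.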