Discovery of local packing motifs in protein structures.

Authors

Jonassen I, Eidhammer I, Taylor W R

Affiliation

Department of Informatics, University of Bergen, Norway.

Publication

Proteins. 1999 Feb 1;34(2):206-19.

Abstract

We present a language for describing structural patterns of residues in protein structures and a method for discovering such patterns that recur in a set of protein structures. The patterns impose restrictions on the spatial position of each residue, the order of the residues along the amino acid chain, and which amino acids are allowed in each position. Unlike other methods for comparing sets of protein structures, our method is not based on pairwise structure comparisons, which are often time consuming and can produce inconsistent results. Instead, the method simultaneously takes into account information from all structures in the search for conserved structure patterns, which are potential structure motifs. The method is based on describing the spatial neighborhood of each residue in each structure as a string and applying a sequence pattern discovery method to find patterns common to subsets of these strings. Finally, it checks whether the similarities between the neighborhood strings correspond to spatially similar substructures. We apply the method to analyze sets of very disparate proteins from four different protein families: serine proteases, cuprodoxins, cysteine proteinases, and ferredoxins. The motifs found by the method correspond well to the site and motif information given in the annotation of these proteins in PDB, Swiss-Prot, and PROSITE. Furthermore, the motifs are confirmed by using the motif data to constrain the structural alignment of the proteins obtained with the program SAP. This gave the best superposition/alignment of the proteins given the motif assignment.
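
The neighborhood-string idea can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the distance cutoff, which atom represents a residue, or which pattern discovery method is used, so the 10.0 default radius, the single coordinate per residue, and the function names `neighborhood_strings`, `is_subsequence`, and `support` are all assumptions made for illustration. A plain ordered-subsequence test stands in for the pattern language, and the final check for spatial similarity is omitted.

```python
from math import dist


def neighborhood_strings(residues, radius=10.0):
    """Build one neighborhood string per residue.

    residues: list of (one_letter_code, (x, y, z)) tuples in chain order.
    Each string lists the amino acids of all residues within `radius`
    of the central residue (itself included), kept in chain order.
    The cutoff value is an assumption, not taken from the paper.
    """
    strings = []
    for _, center in residues:
        neighbors = [aa for aa, xyz in residues if dist(center, xyz) <= radius]
        strings.append("".join(neighbors))
    return strings


def is_subsequence(pattern, text):
    """True if the letters of `pattern` occur in `text` in order, gaps allowed."""
    remaining = iter(text)
    return all(letter in remaining for letter in pattern)


def support(pattern, structures, radius=10.0):
    """Count structures with at least one neighborhood string matching `pattern`.

    This is a simplified stand-in for sequence pattern discovery: it only
    scores a candidate pattern, it does not search for patterns.
    """
    return sum(
        any(is_subsequence(pattern, s) for s in neighborhood_strings(res, radius))
        for res in structures
    )


# Toy coordinates (hypothetical, not real protein data).
toy_a = [("H", (0, 0, 0)), ("G", (3, 0, 0)), ("D", (6, 0, 0)), ("S", (30, 0, 0))]
toy_b = [("H", (0, 0, 0)), ("A", (2, 0, 0)), ("D", (4, 0, 0))]
```

With these toy structures, a pattern such as `"HD"` is supported by both structures because each has a neighborhood containing H followed by D in chain order, while `"HS"` is supported by neither because the S residue in `toy_a` lies far from the rest. Considering all structures at once in this way is what lets the approach avoid pairwise structure comparison.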
