Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation.

Authors

Exarchos Konstantinos P, Exarchos Themis P, Papaloukas Costas, Troganis Anastassios N, Fotiadis Dimitrios I

Affiliation

Unit of Medical Technology and Intelligent Information Systems, Department of Computer Science, University of Ioannina, Ioannina, Greece.

Publication

BMC Bioinformatics. 2009 Apr 20;10:113. doi: 10.1186/1471-2105-10-113.

Abstract

BACKGROUND

Polypeptides are composed of amino acids covalently linked by peptide bonds. Most peptide bonds in proteins occur in the trans conformation. Despite their infrequent occurrence, cis peptide bonds play a key role in protein structure and function, as well as in many significant biological processes.

RESULTS

We perform a systematic analysis of regions in protein sequences that contain a proline cis peptide bond, in order to discover non-random associations between the primary sequence and the nature of proline cis/trans isomerization. For this purpose, we employ an efficient pattern discovery algorithm that finds regular expression-type patterns that are overrepresented (i.e. frequently repeated) in a set of sequences. Four types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, iii) pattern discovery using a structural equivalency set and iv) pattern discovery using certain physicochemical properties of amino acids. The extracted patterns are validated using a specially implemented scoring function and a significance measure (a log-probability estimate) indicative of their specificity. The score threshold is 0.90 for the first three types of pattern discovery and 0.80 for the last type. All patterns yielded significance values in the range [-9, -31], indicating that the derived patterns are highly unlikely to have emerged by chance. Most of the highest-scoring patterns are consistent with previous investigations of the neighborhood of cis proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database, in order to gain insight into the functional implications of cis prolyl bonds.
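To make the idea of regular expression-type patterns over an equivalency set more concrete, the following is a minimal Python sketch. It is not the algorithm or the scoring function used in the paper, neither of which the abstract specifies. The equivalency classes in `EQUIV_CLASSES`, the one-character pattern notation, the toy sequence windows, and the `specificity_score` function (the fraction of matching windows that come from the cis set) are all assumptions made for illustration. Only the 0.90 threshold is taken from the text.

```python
import re

# Hypothetical equivalency classes, for illustration only. The abstract does
# not list the chemical or structural sets actually used in the paper.
EQUIV_CLASSES = {
    "a": "AG",    # small residues (assumed grouping)
    "h": "ILMV",  # aliphatic hydrophobic residues (assumed grouping)
    "r": "FWY",   # aromatic residues (assumed grouping)
}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a toy pattern into a compiled regular expression.

    Uppercase letters match one exact residue, '.' matches any residue,
    and lowercase keys of EQUIV_CLASSES match any member of that class.
    """
    parts = []
    for token in pattern:
        if token == ".":
            parts.append(".")
        elif token in EQUIV_CLASSES:
            parts.append("[" + EQUIV_CLASSES[token] + "]")
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts))


def support(pattern: str, sequences: list[str]) -> int:
    """Count the sequences that contain at least one match of the pattern."""
    regex = pattern_to_regex(pattern)
    return sum(1 for seq in sequences if regex.search(seq))


def specificity_score(pattern: str, cis: list[str], trans: list[str]) -> float:
    """Stand-in score: share of matching windows that belong to the cis set.

    This is an assumed placeholder, not the paper's scoring function.
    """
    cis_hits = support(pattern, cis)
    total = cis_hits + support(pattern, trans)
    return cis_hits / total if total else 0.0


# Toy windows around a proline; invented data, not taken from the study.
cis_windows = ["AYPGS", "GFPAT", "SWPNE"]
trans_windows = ["ALPGS", "KEPAT", "TYPLQ"]

SCORE_THRESHOLD = 0.90  # threshold quoted for the first three discovery types

for candidate in ("rP", "rPa"):
    score = specificity_score(candidate, cis_windows, trans_windows)
    verdict = "kept" if score >= SCORE_THRESHOLD else "discarded"
    print(candidate, pattern_to_regex(candidate).pattern, round(score, 2), verdict)
```

In this toy setting, widening a position from an exact residue to an equivalency class raises the number of windows a pattern can match, and the threshold then filters out patterns that also match many trans windows. A real analysis would additionally need the significance estimate mentioned above, which this sketch does not attempt.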

CONCLUSION

Cis patterns with matches in the PROSITE database fell mostly into two main functional clusters: family signatures and protein signatures. However, a considerable propensity was also observed for targeting signals, active sites and phosphorylation sites, as well as domain signatures.
