Probabilistic analysis of the frequencies of amino acid pairs within characterized protein sequences.

Authors

Shen Shiyi, Kai Bo, Ruan Jishou, Torin Huzil J, Carpenter Eric, Tuszynski Jack A

Affiliations

College of Mathematical Science and LPMC, Nankai University, Tianjin 300071, PR China.

Department of Oncology, Division of Experimental Oncology, Cross Cancer Institute, University of Alberta, 11560 University Avenue, Edmonton, Canada AB T6G 1Z2.

Publication information

Physica A. 2006 Oct 15;370(2):651-662. doi: 10.1016/j.physa.2006.03.004. Epub 2006 Apr 3.

Abstract

Here, we describe a unique probabilistic evaluation of the 20 naturally occurring amino acids and their distributions within the Swiss-Prot and Complete Human Genebank databases. We have developed a computational technique that imposes both directionality and length constraints on searches for unique combinations of amino acids within protein sequences. Using statistical approaches, we have carried out searches of all possible two- and three-residue motifs contained within these databases. This technique is based on the unusually high occurrence of a small number of these motifs when compared to the expected probability of finding a specific residue grouping within a given database. Subsequent filtering of this search to identify such unique combinations has provided several examples that can be used as markers to identify particular proteins within or across databases. We focus on three of these motifs, which were of greatest interest to us. The CC and CM motifs, and the CCM motif that combines the two, all occur either more or less frequently than would be predicted based on standard amino acid distributions within the entire human proteome.
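
The comparison of observed motif counts against an expected probability can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not give their exact statistic. The sketch assumes the simplest null model, in which residues occur independently at their overall database frequencies, and it treats a motif as a contiguous, ordered run of residues (so CM and MC are counted separately). The function name `motif_enrichment` and its parameters are hypothetical.

```python
from collections import Counter


def motif_enrichment(sequences, k=2):
    """Compare observed and expected counts of ordered k-residue motifs.

    Assumption (not from the paper): the expected count uses an
    independence model, i.e. the product of single-residue frequencies
    multiplied by the number of k-residue windows in the data set.
    Returns {motif: (observed, expected, observed / expected)}.
    """
    residue_counts = Counter()
    motif_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        # Slide a window of length k along the sequence; order matters,
        # so "CM" and "MC" are tallied as different motifs.
        motif_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))

    total_residues = sum(residue_counts.values())
    total_windows = sum(motif_counts.values())
    freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    result = {}
    for motif, observed in motif_counts.items():
        probability = 1.0
        for aa in motif:
            probability *= freq[aa]
        expected = probability * total_windows
        result[motif] = (observed, expected, observed / expected)
    return result


if __name__ == "__main__":
    # Toy input only; real use would read sequences from a database export.
    toy = ["CCM", "ACM"]
    for motif, (obs, exp, ratio) in sorted(motif_enrichment(toy, k=2).items()):
        print(motif, obs, round(exp, 3), round(ratio, 3))
```

A ratio well above 1 marks a motif that is over-represented relative to this null model, and a ratio well below 1 marks one that is under-represented. Setting `k=3` applies the same calculation to three-residue motifs such as CCM.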
