Protein identification with N and C-terminal sequence tags in proteome projects.

Authors

Wilkins M R, Gasteiger E, Tonella L, Ou K, Tyler M, Sanchez J C, Gooley A A, Walsh B J, Bairoch A, Appel R D, Williams K L, Hochstrasser D F

Affiliation

Central Clinical Chemistry Laboratory, Geneva University Hospital, 24 Rue Micheli-du-Crest, Geneva 14, 1211, Switzerland.

Publication details

J Mol Biol. 1998 May 8;278(3):599-608. doi: 10.1006/jmbi.1998.1726.

Abstract

Genome sequences are available for increasing numbers of organisms. The proteomes (protein complement expressed by the genome) of many such organisms are being studied with two-dimensional (2D) gel electrophoresis. Here we have investigated the application of short N-terminal and C-terminal sequence tags to the identification of proteins separated on 2D gels. The theoretical N and C termini of 15,519 proteins, representing all SWISS-PROT entries for the organisms Mycoplasma genitalium, Bacillus subtilis, Escherichia coli, Saccharomyces cerevisiae and human, were analysed. Sequence tags were found to be surprisingly specific, with N-terminal tags of four amino acid residues found to be unique for between 43% and 83% of proteins, and C-terminal tags of four amino acid residues unique for between 74% and 97% of proteins, depending on the species studied. Sequence tags of five amino acid residues were found to be even more specific. To utilise this specificity of sequence tags for protein identification, we created a world-wide web-accessible protein identification program, TagIdent (http://www.expasy.ch/www/tools.html), which matches sequence tags of up to six amino acid residues as well as estimated protein pI and mass against proteins in the SWISS-PROT database. We demonstrate the utility of this identification approach with sequence tags generated from 91 different E. coli proteins purified by 2D gel electrophoresis. Fifty-one proteins were unambiguously identified by virtue of their sequence tags and estimated pI and mass, and a further 11 proteins identified when sequence tags were combined with protein amino acid composition data. We conclude that the TagIdent identification approach is best suited to the identification of proteins from prokaryotes whose complete genome sequences are available. The approach is less well suited to proteins from eukaryotes, as many eukaryotic proteins are not amenable to sequencing via Edman degradation, and tag protein identification cannot be unambiguous unless an organism's complete sequence is available.
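
The tag-specificity analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the TagIdent implementation: the function names and the four toy sequences are hypothetical, and the pI and mass filtering that TagIdent also applies is omitted. The sketch only counts how many proteins in a set carry a terminal tag of a given length that no other protein shares, and looks up the proteins matching a given tag.

```python
from collections import Counter

# Hypothetical toy sequences, invented for illustration (not SWISS-PROT entries).
TOY_PROTEINS = {
    "P1": "MKTAYIAKQR",
    "P2": "MKTAGLLVEK",
    "P3": "MSEQNNTEMT",
    "P4": "MARTKQTAQR",
}


def terminal_tag(sequence, k, end="N"):
    """Return the k-residue tag at the N terminus (end="N") or C terminus (end="C")."""
    if end == "N":
        return sequence[:k]
    if end == "C":
        return sequence[-k:]
    raise ValueError("end must be 'N' or 'C'")


def unique_tag_fraction(proteins, k, end="N"):
    """Fraction of proteins whose k-residue terminal tag occurs exactly once in the set."""
    tags = [terminal_tag(seq, k, end) for seq in proteins.values()]
    counts = Counter(tags)
    return sum(1 for tag in tags if counts[tag] == 1) / len(tags)


def match_tag(proteins, tag, end="N"):
    """Names of all proteins whose terminal tag equals the query tag, sorted."""
    return sorted(
        name
        for name, seq in proteins.items()
        if terminal_tag(seq, len(tag), end) == tag
    )


if __name__ == "__main__":
    print(unique_tag_fraction(TOY_PROTEINS, 4, "N"))  # two of four N-terminal tags are unique
    print(unique_tag_fraction(TOY_PROTEINS, 4, "C"))  # all four C-terminal tags are unique
    print(match_tag(TOY_PROTEINS, "MKTA", "N"))       # ambiguous: two candidates
    print(match_tag(TOY_PROTEINS, "AKQR", "C"))       # unambiguous: one candidate
```

In this toy set the N-terminal tag MKTA is shared by two proteins, so it could not identify either one alone, while every C-terminal tag is unique. A tag match can only be called unambiguous if the searched set contains every protein the organism can produce, which is why the abstract ties the approach to organisms with complete genome sequences.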
