Mass spectrometry allows direct identification of proteins in large genomes.

Authors

Küster B, Mortensen P, Andersen J S, Mann M

Affiliations

Protein Interaction Laboratory (PIL), University of Southern Denmark, Odense M, Denmark. MDS-Proteomics, Odense M, Denmark.

Publication information

Proteomics. 2001 May;1(5):641-50. doi: 10.1002/1615-9861(200104)1:5<641::AID-PROT641>3.0.CO;2-R.

PMID:11678034

Abstract

Proteome projects seek to provide systematic functional analysis of the genes uncovered by genome sequencing initiatives. Mass spectrometric protein identification is a key requirement in these studies but to date, database searching tools rely on the availability of protein sequences derived from full length cDNA, expressed sequence tags or predicted open reading frames (ORFs) from genomic sequences. We demonstrate here that proteins can be identified directly in large genomic databases using peptide sequence tags obtained by tandem mass spectrometry. On the background of vast amounts of noncoding DNA sequence, identified peptides localize coding sequences (exons) in a confined region of the genome, which contains the cognate gene. The approach does not require prior information about putative ORFs as predicted by computerized gene finding algorithms. The method scales to the complete human genome and allows identification, mapping, cloning and assistance in gene prediction of any protein for which minimal mass spectrometric information can be obtained. Several novel proteins from Arabidopsis thaliana and human have been discovered in this way.
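
The core idea in the abstract, locating a coding region by matching MS-derived peptide sequence directly against untranslated genomic DNA, can be illustrated with a six-frame translation search. The sketch below is a simplified illustration and not the authors' software: it assumes the standard genetic code, treats the peptide as a plain amino acid string (a real peptide sequence tag also carries flanking mass constraints), ignores introns that split a peptide, and uses hypothetical function names (`translate`, `six_frame_hits`).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate DNA from its first base; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_hits(dna, peptide):
    """Find exact peptide matches in all six reading frames.

    Returns (strand, frame_offset, start) tuples, where start is the 0-based
    nucleotide position of the match on the strand being read.
    """
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein = translate(seq[offset:])
            pos = protein.find(peptide)
            while pos != -1:
                hits.append((strand, offset, offset + 3 * pos))
                pos = protein.find(peptide, pos + 1)
    return hits


if __name__ == "__main__":
    # Toy "genome": two flanking bases, a short coding stretch, a stop codon, two more bases.
    genome = "CC" + "ATGAAAACCGCGTATATTGCGAAA" + "TAAGG"
    print(six_frame_hits(genome, "MKTAYIAK"))
```

Several peptides from one protein that map close together on the same strand would then point to a confined genomic region likely to contain the gene, which is the localization effect the abstract describes. A genome-scale search would need an indexed or streaming approach rather than translating whole chromosomes in memory.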

Similar articles

PepLine: a software pipeline for high-throughput direct mapping of tandem mass spectrometry data on genomic sequences.

J Proteome Res. 2008 May;7(5):1873-83. doi: 10.1021/pr070415k. Epub 2008 Mar 19.

Genome annotation of Anopheles gambiae using mass spectrometry-derived data.

BMC Genomics. 2005 Sep 19;6:128. doi: 10.1186/1471-2164-6-128.

Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics.

Science. 2008 May 16;320(5878):938-41. doi: 10.1126/science.1157956. Epub 2008 Apr 24.

Interrogating the human genome using uninterpreted mass spectrometry data.

Proteomics. 2001 May;1(5):651-67. doi: 10.1002/1615-9861(200104)1:5<651::AID-PROT651>3.0.CO;2-N.

[Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].

Yi Chuan Xue Bao. 2004 May;31(5):431-43.

Analysis of the Arabidopsis cytosolic ribosome proteome provides detailed insights into its components and their post-translational modification.

Mol Cell Proteomics. 2008 Feb;7(2):347-69. doi: 10.1074/mcp.M700052-MCP200. Epub 2007 Oct 13.

Analysis of the Arabidopsis mitochondrial proteome.

Plant Physiol. 2001 Dec;127(4):1711-27.

Proteome research: complementarity and limitations with respect to the RNA and DNA worlds.

Electrophoresis. 1997 Aug;18(8):1217-42. doi: 10.1002/elps.1150180804.

Mass spectrometric identification and microcharacterization of proteins from electrophoretic gels: strategies and applications.

Proteins. 1998;Suppl 2:74-89. doi: 10.1002/(sici)1097-0134(1998)33:2+<74::aid-prot9>3.3.co;2-2.

Cited by

Peptimapper: proteogenomics workflow for the expert annotation of eukaryotic genomes.

BMC Genomics. 2019 Jan 17;20(1):56. doi: 10.1186/s12864-019-5431-9.

Application of LC-MS/MS MRM to Determine Staphylococcal Enterotoxins (SEB and SEA) in Milk.

Toxins (Basel). 2016 Apr 20;8(4):118. doi: 10.3390/toxins8040118.

Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation.

Annu Rev Anal Chem (Palo Alto Calif). 2016 Jun 12;9(1):521-45. doi: 10.1146/annurev-anchem-071015-041722. Epub 2016 Mar 30.

The bacterial proteogenomic pipeline.

BMC Genomics. 2014;15 Suppl 9(Suppl 9):S19. doi: 10.1186/1471-2164-15-S9-S19. Epub 2014 Dec 8.

Proteogenomics: concepts, applications and computational strategies.

Nat Methods. 2014 Nov;11(11):1114-25. doi: 10.1038/nmeth.3144.

Brain proteomics of Anopheles gambiae.

OMICS. 2014 Jul;18(7):421-37. doi: 10.1089/omi.2014.0007. Epub 2014 Jun 17.

Deep coverage of the Escherichia coli proteome enables the assessment of false discovery rates in simple proteogenomic experiments.

Mol Cell Proteomics. 2013 Nov;12(11):3420-30. doi: 10.1074/mcp.M113.029165. Epub 2013 Aug 1.

Whole human genome proteogenomic mapping for ENCODE cell line data: identifying protein-coding regions.

BMC Genomics. 2013 Feb 28;14:141. doi: 10.1186/1471-2164-14-141.

Inference and validation of protein identifications.

Mol Cell Proteomics. 2012 Nov;11(11):1097-104. doi: 10.1074/mcp.R111.014795. Epub 2012 Aug 3.

Proteomic profiling of the planarian Schmidtea mediterranea and its mucous reveals similarities with human secretions and those predicted for parasitic flatworms.

Mol Cell Proteomics. 2012 Sep;11(9):681-91. doi: 10.1074/mcp.M112.019026. Epub 2012 May 31.
