An automated proteogenomic method uses mass spectrometry to reveal novel genes in Zea mays.

Affiliation

Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92092.

Publication information

Mol Cell Proteomics. 2014 Jan;13(1):157-67. doi: 10.1074/mcp.M113.031260. Epub 2013 Oct 18.

DOI:10.1074/mcp.M113.031260

PMID:24142994

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3879611/

Abstract

New technologies in genomics and proteomics have influenced the emergence of proteogenomics, a field at the confluence of genomics, transcriptomics, and proteomics. First generation proteogenomic toolkits employ peptide mass spectrometry to identify novel protein coding regions. We extend first generation proteogenomic tools to achieve greater accuracy and enable the analysis of large, complex genomes. We apply our pipeline to Zea mays, which has a genome comparable in size to human. Our pipeline begins with the comparison of mass spectra to a putative translation of the genome. We select novel peptides, those that match a region of the genome that was not previously known to be protein coding, for grouping into refinement events. We present a novel, probabilistic framework for evaluating the accuracy of each event. Our calculated event probability, or eventProb, considers the number of supporting peptides and spectra, and the quality of each supporting peptide-spectrum match. Our pipeline predicts 165 novel protein-coding genes and proposes updated models for 741 additional genes.
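The first pipeline step, comparing spectra to a putative translation of the genome, needs a protein-level view of raw DNA. The abstract does not say how that translation is built. A common approach in proteogenomics is six-frame translation, and the sketch below illustrates that approach only; it is not the authors' implementation. The function names (`reverse_complement`, `translate`, `six_frame_translation`) and the use of the standard genetic code are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (assumed here;
# organellar genomes use different tables).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict[str, str]:
    """Translate both strands in all three reading frames.

    Keys are '+1', '+2', '+3' for the forward strand and '-1', '-2', '-3'
    for the reverse strand; '*' marks a stop codon.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

In a search database built this way, each translated frame would typically be split at stop codons into candidate open reading frames before matching against spectra. For a genome the size of maize this naive form produces a very large search space, which is one reason the probabilistic filtering described in the abstract matters.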
