Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage.

Authors

Willems Patrick, Fijalkowski Igor, Van Damme Petra

Affiliation

Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium.

Publication information

mSystems. 2020 Oct 27;5(5):e00833-20. doi: 10.1128/mSystems.00833-20.

DOI:10.1128/mSystems.00833-20
PMID:33109751
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7593589/
Abstract

Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation.

Delineation of open reading frames (ORFs) causes persistent inconsistencies in prokaryote genome annotation. We demonstrate that by advanced (re)analysis of omics data, a higher proteome coverage and sensitive detection of unannotated ORFs can be achieved, which can be exploited for conditional bacterial genome (re)annotation. This is especially relevant in view of annotating the wealth of sequenced prokaryotic genomes obtained in recent years.
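The abstract names "comparison to predicted fragment ion intensities" as one of the extended peptide-to-spectrum quality features used in postprocessing. The following is a minimal Python sketch of how such a feature could be computed, assuming the observed and predicted intensities have already been matched fragment by fragment into two equal-length lists. The function names, the choice of Pearson correlation and normalized spectral angle as similarity measures, and the example intensities are illustrative assumptions, not the authors' published feature set.

```python
import math


def pearson(observed, predicted):
    """Pearson correlation between two equal-length intensity lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return 0.0  # correlation is undefined for a flat vector; treat as no evidence
    return cov / math.sqrt(var_o * var_p)


def spectral_angle(observed, predicted):
    """Normalized spectral angle: 1 for identical direction, 0 for orthogonal vectors."""
    dot = sum(o * p for o, p in zip(observed, predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_o == 0 or norm_p == 0:
        return 0.0
    cosine = max(-1.0, min(1.0, dot / (norm_o * norm_p)))  # clamp rounding error
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


def intensity_features(observed, predicted):
    """Return hypothetical rescoring features for one peptide-to-spectrum match."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("intensity lists must be non-empty and of equal length")
    return {
        "pearson": pearson(observed, predicted),
        "spectral_angle": spectral_angle(observed, predicted),
    }


# Made-up intensities for four matched fragment ions (illustration only).
observed = [0.0, 0.4, 1.0, 0.2]
close_prediction = [0.05, 0.35, 1.0, 0.25]
poor_prediction = [1.0, 0.1, 0.0, 0.6]

print(intensity_features(observed, close_prediction))
print(intensity_features(observed, poor_prediction))
```

In a rescoring setting, values like these would typically be appended to the search engine's own scores and passed to a semi-supervised classifier, so that a match whose observed spectrum agrees with the prediction gains confidence relative to one that does not.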

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/aec96377ccf8/mSystems.00833-20-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/82bf194406e2/mSystems.00833-20-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/865015cd6e02/mSystems.00833-20-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/d53e124529a0/mSystems.00833-20-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/c0a674210ad2/mSystems.00833-20-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/1a8570420e73/mSystems.00833-20-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/7cd0b1cfb2e5/mSystems.00833-20-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aafb/7593589/f36a061c19a2/mSystems.00833-20-f0008.jpg
