Proteogenomic analysis of polymorphisms and gene annotation divergences in prokaryotes using a clustered mass spectrometry-friendly database.

Affiliation

The Gade Institute, Section for Microbiology and Immunology, University of Bergen, N-5021 Bergen, Norway.

Publication information

Mol Cell Proteomics. 2011 Jan;10(1):M110.002527. doi: 10.1074/mcp.M110.002527. Epub 2010 Oct 28.

DOI:10.1074/mcp.M110.002527

PMID:21030493

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3013451/

Abstract

Precise annotation of genes or open reading frames is still a difficult task that results in divergence even for data generated from the same genomic sequence. This has an impact in further proteomic studies, and also compromises the characterization of clinical isolates with many specific genetic variations that may not be represented in the selected database. We recently developed software called multistrain mass spectrometry prokaryotic database builder (MSMSpdbb) that can merge protein databases from several sources and be applied on any prokaryotic organism, in a proteomic-friendly approach. We generated a database for the Mycobacterium tuberculosis complex (using three strains of Mycobacterium bovis and five of M. tuberculosis), and analyzed data collected from two laboratory strains and two clinical isolates of M. tuberculosis. We identified 2561 proteins, of which 24 were present in M. tuberculosis H37Rv samples, but not annotated in the M. tuberculosis H37Rv genome. We were also able to identify 280 nonsynonymous single amino acid polymorphisms and confirm 367 translational start sites. As a proof of concept we applied the database to whole-genome DNA sequencing data of one of the clinical isolates, which allowed the validation of 116 predicted single amino acid polymorphisms and the annotation of 131 N-terminal start sites. Moreover we identified regions not present in the original M. tuberculosis H37Rv sequence, indicating strain divergence or errors in the reference sequence. In conclusion, we demonstrated the potential of using a merged database to better characterize laboratory or clinical bacterial strains.
