Ursgal, Universal Python Module Combining Common Bottom-Up Proteomics Tools for Large-Scale Analysis.

Authors

Kremer Lukas P M, Leufken Johannes, Oyunchimeg Purevdulam, Schulze Stefan, Fufezan Christian

Affiliation

Institute of Plant Biology and Biotechnology, University of Muenster, Schlossplatz 8, 48143 Münster, Germany.

Publication information

J Proteome Res. 2016 Mar 4;15(3):788-94. doi: 10.1021/acs.jproteome.5b00860. Epub 2016 Jan 13.

DOI:10.1021/acs.jproteome.5b00860

PMID:26709623

Abstract

Proteomics data integration has become a broad field with a variety of programs offering innovative algorithms to analyze increasing amounts of data. Unfortunately, this software diversity leads to many problems as soon as the data is analyzed using more than one algorithm for the same task. Although it was shown that the combination of multiple peptide identification algorithms yields more robust results, it is only recently that unified approaches are emerging; however, workflows that, for example, aim to optimize search parameters or that employ cascaded-style searches can only be made accessible if data analysis becomes not only unified but also, and most importantly, scriptable. Here we introduce Ursgal, a Python interface to many commonly used bottom-up proteomics tools and to additional auxiliary programs. Complex workflows can thus be composed in the Python scripting language with a few lines of code. Ursgal is easily extensible, and we have made several database search engines (X!Tandem, OMSSA, MS-GF+, Myrimatch, MS Amanda), statistical postprocessing algorithms (qvality, Percolator), and one algorithm that combines statistically postprocessed outputs from multiple search engines ("combined FDR") accessible as an interface in Python. Furthermore, we have implemented a new algorithm ("combined PEP") that combines multiple search engines employing elements of "combined FDR", PeptideShaker, and Bayes' theorem.
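The abstract only names the ingredients of "combined PEP", so the following is a rough sketch of the Bayes' theorem part alone, not Ursgal's implementation. It assumes that each search engine reports a posterior error probability (PEP) for the same peptide-spectrum match and that the engines err independently, which is a simplification. The function name `combine_peps` is hypothetical, and the elements taken from "combined FDR" and PeptideShaker are not reproduced here.

```python
from math import prod
from typing import Sequence


def combine_peps(peps: Sequence[float]) -> float:
    """Combine per-engine PEPs for one peptide-spectrum match.

    Assumption (not taken from the abstract): engines are treated as
    independent, so the probability that the match is wrong given all
    engines is prod(PEP) / (prod(PEP) + prod(1 - PEP)).
    """
    if not peps:
        raise ValueError("need at least one PEP")
    if any(not 0.0 <= p <= 1.0 for p in peps):
        raise ValueError("PEPs must lie in [0, 1]")

    all_wrong = prod(peps)                     # every engine is wrong
    all_right = prod(1.0 - p for p in peps)    # every engine is right
    total = all_wrong + all_right
    if total == 0.0:
        # One engine is certain the match is wrong, another that it is right.
        raise ValueError("contradictory PEPs of exactly 0 and 1")
    return all_wrong / total


if __name__ == "__main__":
    # Two engines that each report a PEP of 0.1 for the same match.
    print(combine_peps([0.1, 0.1]))
```

Under this independence assumption, agreement between engines lowers the combined PEP below either individual value, which matches the abstract's point that combining identification algorithms gives more robust results.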

Similar articles

Optimization of Search Engines and Postprocessing Approaches to Maximize Peptide and Protein Identification for High-Resolution Mass Data.

J Proteome Res. 2015 Nov 6;14(11):4662-73. doi: 10.1021/acs.jproteome.5b00536. Epub 2015 Sep 30.

Enhancing Open Modification Searches via a Combined Approach Facilitated by Ursgal.

J Proteome Res. 2021 Apr 2;20(4):1986-1996. doi: 10.1021/acs.jproteome.0c00799. Epub 2021 Jan 29.

In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics.

J Proteomics. 2017 Jan 6;150:170-182. doi: 10.1016/j.jprot.2016.08.002. Epub 2016 Aug 4.

IPeak: An open source tool to combine results from multiple MS/MS search engines.

Proteomics. 2015 Sep;15(17):2916-20. doi: 10.1002/pmic.201400208. Epub 2015 Aug 6.

IdentiPy: An Extensible Search Engine for Protein Identification in Shotgun Proteomics.

J Proteome Res. 2018 Jul 6;17(7):2249-2255. doi: 10.1021/acs.jproteome.7b00640. Epub 2018 Jun 18.

Scavager: A Versatile Postsearch Validation Algorithm for Shotgun Proteomics Based on Gradient Boosting.

Proteomics. 2019 Feb;19(3):e1800280. doi: 10.1002/pmic.201800280. Epub 2018 Dec 27.

PepArML: A Meta-Search Peptide Identification Platform for Tandem Mass Spectra.

Curr Protoc Bioinformatics. 2013 Dec;44(1323):13.23.1-23. doi: 10.1002/0471250953.bi1323s44.

Comparative database search engine analysis on massive tandem mass spectra of pork-based food products for halal proteomics.

J Proteomics. 2021 Jun 15;241:104240. doi: 10.1016/j.jprot.2021.104240. Epub 2021 Apr 21.

Algorithms for database-dependent search of MS/MS data.

Methods Mol Biol. 2013;1007:119-38. doi: 10.1007/978-1-62703-392-3_5.

Cited by

Quorum sensing mediates morphology and motility transitions in the model archaeon .

mBio. 2025 Jun 18:e0090625. doi: 10.1128/mbio.00906-25.

Quorum sensing mediates morphology and motility transitions in the model archaeon .

bioRxiv. 2025 Jan 14:2025.01.14.633064. doi: 10.1101/2025.01.14.633064.

Identification of structural and regulatory cell-shape determinants in Haloferax volcanii.

Nat Commun. 2024 Feb 15;15(1):1414. doi: 10.1038/s41467-024-45196-0.

Multienzyme deep learning models improve peptide de novo sequencing by mass spectrometry proteomics.

PLoS Comput Biol. 2023 Jan 20;19(1):e1010457. doi: 10.1371/journal.pcbi.1010457. eCollection 2023 Jan.

Proteomic Sample Preparation and Data Analysis in Line with the Archaeal Proteome Project.

Methods Mol Biol. 2022;2522:287-300. doi: 10.1007/978-1-0716-2445-6_18.

Comprehensive glycoproteomics shines new light on the complexity and extent of glycosylation in archaea.

PLoS Biol. 2021 Jun 17;19(6):e3001277. doi: 10.1371/journal.pbio.3001277. eCollection 2021 Jun.

Cerebrospinal fluid proteome maps detect pathogen-specific host response patterns in meningitis.

Elife. 2021 Apr 6;10:e64159. doi: 10.7554/eLife.64159.

SMITER-A Python Library for the Simulation of LC-MS/MS Experiments.

Genes (Basel). 2021 Mar 11;12(3):396. doi: 10.3390/genes12030396.

Enhancing Open Modification Searches via a Combined Approach Facilitated by Ursgal.

J Proteome Res. 2021 Apr 2;20(4):1986-1996. doi: 10.1021/acs.jproteome.0c00799. Epub 2021 Jan 29.

Altered -glycan composition impacts flagella-mediated adhesion in .

Elife. 2020 Dec 10;9:e58805. doi: 10.7554/eLife.58805.
