Griss Johannes, Jones Andrew R, Sachsenberg Timo, Walzer Mathias, Gatto Laurent, Hartler Jürgen, Thallinger Gerhard G, Salek Reza M, Steinbeck Christoph, Neuhauser Nadin, Cox Jürgen, Neumann Steffen, Fan Jun, Reisinger Florian, Xu Qing-Wei, Del Toro Noemi, Pérez-Riverol Yasset, Ghali Fawaz, Bandeira Nuno, Xenarios Ioannis, Kohlbacher Oliver, Vizcaíno Juan Antonio, Hermjakob Henning
From the ‡European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK; §Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Vienna, Austria;
‖Institute of Integrative Biology, University of Liverpool, L69 7ZB, Liverpool, UK;
Mol Cell Proteomics. 2014 Oct;13(10):2765-75. doi: 10.1074/mcp.O113.036681. Epub 2014 Jun 30.
The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adopted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.
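To illustrate how a tabular, line-oriented format of this kind can be consumed with general-purpose tools, the following is a minimal sketch in Python. It assumes the line prefixes used by mzTab 1.0 (MTD for metadata, PRH/PRT for the protein header and rows, PEH/PEP for peptides, PSH/PSM for peptide-spectrum matches, SMH/SML for small molecules, COM for comments). The sample content, its accessions, and the function name `read_mztab` are hypothetical, and the columns shown are only a small subset of what a valid file requires; the specification document remains the authoritative reference.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, heavily abbreviated mzTab-style content for illustration only.
# A real file carries many more mandatory metadata entries and columns.
SAMPLE = "\n".join([
    "COM\tIllustrative example, not a complete or validated mzTab file",
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\tToy example",
    "PRH\taccession\tdescription\tbest_search_engine_score[1]",
    "PRT\tEXAMPLE_1\tExample protein A\t0.01",
    "PRT\tEXAMPLE_2\tExample protein B\t0.03",
    "SMH\tidentifier\tchemical_formula",
    "SML\tCHEBI:17234\tC6H12O6",
])

# Each section has a header-line prefix and a matching data-row prefix.
HEADER_TO_ROW = {"PRH": "PRT", "PEH": "PEP", "PSH": "PSM", "SMH": "SML"}


def read_mztab(text):
    """Split mzTab-style text into a metadata dict and per-section row lists.

    Assumption: this sketch does no validation and ignores anything it does
    not recognise; it is not a substitute for a specification-aware parser.
    """
    metadata = {}
    headers = {}                  # row prefix -> list of column names
    tables = defaultdict(list)    # row prefix -> list of row dicts

    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields or fields[0] == "COM":
            continue              # skip blank lines and comments
        prefix, values = fields[0], fields[1:]
        if prefix == "MTD" and len(values) >= 2:
            metadata[values[0]] = values[1]
        elif prefix in HEADER_TO_ROW:
            headers[HEADER_TO_ROW[prefix]] = values
        elif prefix in headers:
            # Pair each value with the column name from the section header.
            tables[prefix].append(dict(zip(headers[prefix], values)))
    return metadata, dict(tables)


metadata, tables = read_mztab(SAMPLE)
for row in tables["PRT"]:
    print(row["accession"], row["description"])
```

Because every line is plain tab-separated text, the same content could also be opened in a spreadsheet or loaded with a generic table reader after filtering on the first column; the sketch only groups rows by section so that each row can be addressed by column name.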