Quantitative Biology Center and Department of Computer Science, Center for Bioinformatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany.
Mol Cell Proteomics. 2013 Aug;12(8):2332-40. doi: 10.1074/mcp.O113.028506. Epub 2013 Apr 18.
The range of heterogeneous approaches available for quantifying protein abundance via mass spectrometry (MS) leads to considerable challenges in modeling, archiving, exchanging, or submitting experimental data sets as supplemental material to journals. To date, there has been no widely accepted format for capturing the evidence trail of how quantitative analysis has been performed by software, for transferring data between software packages, or for submitting to public databases. In the context of the Proteomics Standards Initiative, we have developed the mzQuantML data standard. The standard can represent quantitative data about regions in two-dimensional retention time versus mass/charge space (called features), peptides, and proteins and protein groups (where there is ambiguity regarding peptide-to-protein inference), and it offers limited support for small molecule (metabolomic) data. The format has structures for representing replicate MS runs, grouping of replicates (for example, as study variables), and capturing the parameters used by software packages to arrive at these values. The format has the capability to reference other standards such as mzML and mzIdentML, and thus the evidence trail for the MS workflow as a whole can now be described. Several software implementations are available, and we encourage other bioinformatics groups to use mzQuantML as an input, internal, or output format for quantitative software and for structuring local repositories. All project resources are available in the public domain from the HUPO Proteomics Standards Initiative at http://www.psidev.info/mzquantml.
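The grouping of replicate runs into study variables described above can be illustrated with a small in-memory model. This is a minimal sketch under stated assumptions, not the mzQuantML schema: the class names (`Assay`, `StudyVariable`, `QuantifiedObject`), their fields, the `summarize` helper, and the example values are all hypothetical and chosen only to show how per-run quantities for a feature, peptide, protein, or protein group might be related to replicate groups.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Assay:
    """One quantified MS run (hypothetical stand-in, not a schema element)."""
    id: str


@dataclass
class StudyVariable:
    """A named group of replicate assays, e.g. an experimental condition."""
    name: str
    assay_ids: list[str]


@dataclass
class QuantifiedObject:
    """A feature, peptide, protein, or protein group with one value per assay."""
    id: str
    level: str  # "feature", "peptide", "protein", or "protein_group"
    values: dict[str, float] = field(default_factory=dict)  # assay id -> abundance


def summarize(obj, study_variables):
    """Average an object's per-assay values within each study variable.

    Assays with no value for this object are skipped; a study variable
    with no values at all is left out of the result.
    """
    summary = {}
    for sv in study_variables:
        present = [obj.values[a] for a in sv.assay_ids if a in obj.values]
        if present:
            summary[sv.name] = mean(present)
    return summary


# Illustrative data only: four replicate runs in two groups.
assays = [Assay("a1"), Assay("a2"), Assay("a3"), Assay("a4")]
groups = [
    StudyVariable("control", ["a1", "a2"]),
    StudyVariable("treated", ["a3", "a4"]),
]
protein = QuantifiedObject(
    "prot_1", "protein", {"a1": 10.0, "a2": 14.0, "a3": 30.0, "a4": 34.0}
)
result = summarize(protein, groups)
print(result)  # {'control': 12.0, 'treated': 32.0}
```

Keeping the per-assay values alongside the group summaries mirrors the idea of preserving an evidence trail: the averaged numbers can always be traced back to the individual runs that produced them.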