Program for Bioinformatics, Boston University, Boston, Massachusetts 02215, United States.
Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, 999077 Hong Kong, P. R. China.
Anal Chem. 2024 Nov 19;96(46):18491-18501. doi: 10.1021/acs.analchem.4c04091. Epub 2024 Nov 8.
Mass spectral libraries are collections of reference spectra, usually associated with the specific analytes from which the spectra were generated, that are used for downstream analysis of new spectra. Many different formats are used for encoding spectral libraries, but none has undergone a standardization process to ensure broad applicability across applications. As part of the Human Proteome Organization Proteomics Standards Initiative (PSI), we have developed a standardized format for encoding spectral libraries, called mzSpecLib (https://psidev.info/mzSpecLib). It is primarily a data model that flexibly encodes metadata about the library entries using the extensible PSI-MS controlled vocabulary, and it can be encoded in, and converted between, different serialization formats. We have also developed a standardized data model and serialization for fragment ion peak annotations, called mzPAF (https://psidev.info/mzPAF). It is defined as a separate standard because it may be used for applications other than spectral libraries. The mzSpecLib and mzPAF standards are compatible with existing PSI standards such as ProForma 2.0 and the Universal Spectrum Identifier. Both standards have been defined primarily for peptides in proteomics applications, with basic support for small molecules. They could be extended in the future to other fields that need to encode spectral libraries for nonpeptidic analytes.