Dablanc Axel, Hennechart Solweig, Perez Amélie, Cabanac Guillaume, Guitton Yann, Paulhe Nils, Lyan Bernard, Jamin Emilien L, Giacomoni Franck, Marti Guillaume
Laboratoire de Recherche en Sciences Végétales, Metatoul-AgromiX Platform, Université de Toulouse, CNRS, INP, 24 Chemin de Borde Rouge, Auzeville, Auzeville-Tolosane 31320, France.
MetaboHUB-MetaToul, National Infrastructure of Metabolomics and Fluxomics, Toulouse 31000, France.
Anal Chem. 2024 Jul 19;96(30):12489-96. doi: 10.1021/acs.analchem.4c02219.
Open mass spectral libraries (OMSLs) are critical for metabolite annotation and machine learning, especially given the rising volume of untargeted metabolomic studies and the development of annotation pipelines. Despite their importance, the practical application of OMSLs is hampered by the lack of standardized file formats, standardized metadata fields, and a supporting ontology. Current libraries are often restricted to specific topics or matrices, such as natural products, lipids, or the human metabolome, which may limit the discovery potential of untargeted studies. FragHub aims to let users integrate various OMSLs into a single unified format, thereby improving annotation accuracy and reliability. It addresses these challenges by integrating multiple OMSLs into a single comprehensive database, supporting various data formats, and harmonizing metadata. It also offers generic mass-spectrum filters through a graphical user interface. Additionally, a workflow to generate in-house libraries compatible with FragHub is proposed. FragHub dynamically segregates libraries by ionization mode and chromatography technique, thereby enhancing data utility in metabolomic research. The FragHub Python code is publicly available under an MIT license at the following repository: https://github.com/eMetaboHUB/FragHub. Generated data can be accessed via DOI 10.5281/zenodo.11057687.
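The kind of metadata harmonization and ionization-mode segregation described above can be illustrated with a minimal Python sketch. This is not FragHub's actual code or API: the synonym tables, the canonical field names (NAME, PRECURSORMZ, IONMODE), and the functions harmonize and split_by_ion_mode are all hypothetical, chosen only to show how records from differently formatted libraries might be mapped onto one schema and then separated by ion mode.

```python
from collections import defaultdict

# Hypothetical synonym table (an assumption, not FragHub's own mapping):
# field names as they might appear in different libraries -> one canonical name.
FIELD_SYNONYMS = {
    "name": "NAME",
    "compound_name": "NAME",
    "precursormz": "PRECURSORMZ",
    "precursor_mz": "PRECURSORMZ",
    "pepmass": "PRECURSORMZ",
    "ionmode": "IONMODE",
    "ion_mode": "IONMODE",
    "ionization_mode": "IONMODE",
}

# Hypothetical value table for the ionization mode field.
ION_MODE_VALUES = {
    "positive": "positive", "pos": "positive", "p": "positive", "+": "positive",
    "negative": "negative", "neg": "negative", "n": "negative", "-": "negative",
}


def parse_mz(value):
    """Return the first whitespace-separated token as a float, or None."""
    tokens = str(value).split()
    try:
        return float(tokens[0]) if tokens else None
    except ValueError:
        return None


def harmonize(record):
    """Map one raw metadata dict onto the canonical field names."""
    out = {}
    for key, value in record.items():
        norm = key.strip().lower().replace(" ", "_")
        # Unknown fields are kept under their upper-cased normalized name.
        canonical = FIELD_SYNONYMS.get(norm, norm.upper())
        out[canonical] = value
    if "PRECURSORMZ" in out:
        out["PRECURSORMZ"] = parse_mz(out["PRECURSORMZ"])
    raw_mode = str(out.get("IONMODE", "")).strip().lower()
    out["IONMODE"] = ION_MODE_VALUES.get(raw_mode, "unknown")
    return out


def split_by_ion_mode(records):
    """Harmonize every record and group the results by ionization mode."""
    groups = defaultdict(list)
    for record in records:
        clean = harmonize(record)
        groups[clean["IONMODE"]].append(clean)
    return dict(groups)


# Illustrative records imitating three differently formatted source libraries.
EXAMPLE_RECORDS = [
    {"Name": "caffeine", "PEPMASS": "195.0877", "IonMode": "Positive"},
    {"compound_name": "citric acid", "Precursor MZ": "191.0197", "ION_MODE": "NEG"},
    {"NAME": "no mode given", "precursormz": "100.0"},
]
```

Calling split_by_ion_mode(EXAMPLE_RECORDS) groups the three records under "positive", "negative", and "unknown". A table-driven mapping is used here because new field spellings can be added without touching the logic; a real pipeline would also need to handle peak lists, chromatography metadata, and structure identifiers, which this sketch omits.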