Institute of Biomedical Informatics, Graz University of Technology, Stremayrgasse 16/I, 8010, Graz, Austria.
Omics Center Graz, BioTechMed-Graz, Stiftingtalstrasse 24, 8010, Graz, Austria.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab073.
Metabolomics, the comprehensive study of the metabolome, and lipidomics, the large-scale study of pathways and networks of cellular lipids, are major driving forces in enabling personalized medicine. Complicated and error-prone data analysis remains a bottleneck, however, especially for identifying novel metabolites. Comparing experimental mass spectra to curated databases containing reference spectra has been the gold standard for identification of compounds, but constructing such databases is a costly and time-consuming task. Many software applications try to circumvent this process by utilizing cutting-edge advances in computational methods, including quantum chemistry and machine learning, and simulate mass spectra by performing theoretical, so-called in silico, fragmentations of compounds. Other solutions concentrate directly on experimental spectra and try to identify structural properties by investigating recurring patterns and the relationships between them. The considerable progress made in the field allows recent approaches to provide valuable clues to expedite annotation of experimental mass spectra. This review sheds light on the individual strengths and weaknesses of these tools and attempts to evaluate them, especially in view of lipidomics, when considering complex mixtures found in biological samples as well as mass spectrometer inter-instrument variability.