Department of Analytical Chemistry and Reference Materials, Organic Trace Analysis and Food Analysis, Bundesanstalt für Materialforschung und -prüfung (BAM), Berlin, Germany.
eScience, Bundesanstalt für Materialforschung und -prüfung (BAM), Berlin, Germany.
Rapid Commun Mass Spectrom. 2024 Oct 30;38(20):e9876. doi: 10.1002/rcm.9876.
Non-targeted screenings (NTS) are essential tools in fields such as forensics, health and environmental sciences. NTS often employ mass spectrometry (MS) methods because of their higher throughput and sensitivity compared with, for example, nuclear magnetic resonance-based methods. Because the identification of mass spectral signals, called annotation, is labour intensive, supporting tools based on machine learning (ML) have been developed for it. However, both the diversity of mass spectral signals and the sheer number of ML tools developed for compound annotation make it difficult for researchers to maintain a comprehensive overview of the field. In this work, we illustrate which ML-based methods are available for compound annotation in non-targeted MS experiments and provide a nuanced comparison of the ML models used in MS data analysis, highlighting their distinguishing features and performance metrics. Through this overview, we support researchers in applying these tools judiciously in their daily research. This review also explores methods and datasets in detail to reveal gaps in current methods and promising target areas, offering a starting point for developers who intend to improve existing methodologies.