Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, SE-106 91, Stockholm, Sweden.
Institute of Biodiversity, Faculty of Biological Science, Cluster of Excellence Balance of the Microverse, Friedrich-Schiller-University Jena, 07743, Jena, Germany.
Environ Sci Technol. 2024 Oct 1;58(39):17406-17418. doi: 10.1021/acs.est.4c02833. Epub 2024 Sep 19.
The machine-learning tool MS2Tox can prioritize hazardous nontargeted molecular features in environmental waters by predicting the acute fish lethality of unknown molecules from their MS spectra, prior to structural annotation. It has not yet been investigated how the extent of molecular coverage, MS spectral quality, and toxicity prediction confidence depend on sample complexity and MS data acquisition strategy. Using liquid chromatography high-resolution mass spectrometry, we compared two common nontargeted MS acquisition strategies in terms of structural annotation accuracy by SIRIUS+CSI:FingerID and MS2Tox toxicity prediction for 191 reference chemicals spiked into LC-MS water, groundwater, surface water, and wastewater. Data-dependent acquisition (DDA) gave higher rates of correct structural annotation among the reference chemicals (19-62%) than data-independent acquisition (DIA, 19-50%) in all matrices except wastewater. However, DIA gave higher MS detection rates (59-84% for DIA versus 37-82% for DDA), leading to higher true positive rates for spectral library matching (40-73% versus 34-72%). DDA yielded more accurate MS2Tox toxicity predictions than DIA, with root-mean-square errors of 0.62 and 0.71 log-mM, respectively. Given the importance of MS spectral quality, we introduce a "CombinedConfidence" score to convey relative confidence in MS2Tox predictions and apply this approach to prioritize potentially ecotoxic nontargeted features in environmental waters.
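For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how a root-mean-square error between predicted and measured toxicity values on a log-mM scale could be computed. The function name `rmse` and the example values are hypothetical illustrations; they are not taken from the study or from the MS2Tox code.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values in log-mM, made up for illustration only.
predicted_log_mM = [-1.2, -2.5, -0.8, -3.1]
observed_log_mM = [-1.0, -3.0, -1.5, -2.9]

print(f"RMSE = {rmse(predicted_log_mM, observed_log_mM):.2f} log-mM")
```

Because the values are on a logarithmic scale, an error of 1 log-mM corresponds to a tenfold difference in concentration, so the reported errors of 0.62 and 0.71 log-mM both represent less than an order of magnitude.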