Prioritization of Unknown LC-HRMS Features Based on Predicted Toxicity Categories.

Author information

Turkina Viktoriia, Gringhuis Jelle T, Boot Sanne, Petrignani Annemieke, Corthals Garry, Praetorius Antonia, O'Brien Jake W, Samanipour Saer

Affiliations

Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands.

Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam, 1090 GE, Amsterdam, Netherlands.

Publication information

Environ Sci Technol. 2025 Apr 29;59(16):8004-8015. doi: 10.1021/acs.est.4c13026. Epub 2025 Apr 20.

Abstract

Complex environmental samples contain a diverse array of known and unknown constituents. While liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) nontargeted analysis (NTA) has emerged as an essential tool for the comprehensive study of such samples, the identification of individual constituents remains a significant challenge, primarily due to the vast number of detected features in each sample. To address this, prioritization strategies are frequently employed to narrow the focus to the most relevant features for further analysis. In this study, we developed a novel prioritization strategy that directly links fragmentation and chromatographic data to aquatic toxicity categories, bypassing the need for identification of individual compounds. Given that features are not always well-characterized through fragmentation, we created two models: (1) a Random Forest Classification (RFC) model, which classifies fish toxicity categories based on MS1, retention, and fragmentation data (expressed as cumulative neutral losses, CNLs) when fragmentation information is available, and (2) a Kernel Density Estimation (KDE) model that relies solely on retention time and MS1 data when fragmentation is absent. Both models demonstrated accuracy comparable to that of structure-based prediction methods. We further tested the models on a pesticide mixture in a tea extract measured by LC-HRMS, where the CNL-based RFC model achieved 0.76 accuracy and the KDE model reached 0.61, showcasing their robust performance in real-world applications.
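The two-model routing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: a CNL is taken as precursor m/z minus fragment m/z, CNLs are encoded as presence bits over hypothetical 1 Da bins, the two category labels (0 and 1) are placeholders, and the training data are synthetic. The helper names `cnl_bits`, `KDEClassifier`, and `predict_category` are hypothetical; only the scikit-learn classes `RandomForestClassifier` and `KernelDensity` are real library components.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KernelDensity


def cnl_bits(precursor_mz, fragment_mzs, bin_edges):
    """Binary vector marking which mass bins contain a cumulative neutral loss.

    Assumption: a CNL is computed as precursor m/z minus fragment m/z.
    """
    losses = precursor_mz - np.asarray(fragment_mzs, dtype=float)
    losses = losses[losses > 0]  # drop the precursor itself and invalid peaks
    counts, _ = np.histogram(losses, bins=bin_edges)
    return (counts > 0).astype(int)


class KDEClassifier:
    """One Gaussian KDE per class; predicts the class with the highest
    prior-weighted log density."""

    def __init__(self, bandwidth=1.0):
        self.bandwidth = bandwidth

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.kdes_ = [
            KernelDensity(bandwidth=self.bandwidth).fit(X[y == c])
            for c in self.classes_
        ]
        self.log_priors_ = [np.log(np.mean(y == c)) for c in self.classes_]
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = np.column_stack(
            [k.score_samples(X) + p for k, p in zip(self.kdes_, self.log_priors_)]
        )
        return self.classes_[np.argmax(scores, axis=1)]


def predict_category(feature, rfc, kde, bin_edges):
    """Use the RFC when fragments are available, otherwise fall back to the KDE."""
    base = [feature["mz"], feature["rt"]]
    if feature.get("fragments"):
        bits = cnl_bits(feature["mz"], feature["fragments"], bin_edges)
        x = np.concatenate([base, bits])
        return rfc.predict(x.reshape(1, -1))[0]
    return kde.predict([base])[0]


# Synthetic training set: two placeholder categories with distinct m/z,
# retention time, and one characteristic neutral loss each.
rng = np.random.default_rng(0)
bin_edges = np.arange(0.0, 101.0, 1.0)  # hypothetical 1 Da bins, 0 to 100 Da
rows, labels = [], []
for label, (mz0, rt0, loss) in enumerate([(200.0, 3.0, 18.5), (400.0, 9.0, 44.5)]):
    for _ in range(40):
        mz, rt = rng.normal(mz0, 5.0), rng.normal(rt0, 0.3)
        rows.append(np.concatenate([[mz, rt], cnl_bits(mz, [mz - loss], bin_edges)]))
        labels.append(label)
X, y = np.array(rows), np.array(labels)

rfc = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
# Real data would need feature scaling before a single-bandwidth KDE.
kde = KDEClassifier(bandwidth=5.0).fit(X[:, :2], y)

with_ms2 = {"mz": 402.0, "rt": 9.1, "fragments": [357.5]}
without_ms2 = {"mz": 198.0, "rt": 2.9, "fragments": []}
print(predict_category(with_ms2, rfc, kde, bin_edges))
print(predict_category(without_ms2, rfc, kde, bin_edges))
```

The point of the sketch is the routing: every feature receives a predicted category, and the fragmentation-based model is used only when MS2 information exists. Binning, bandwidth, and category definitions would have to follow the paper's own choices in any real application.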

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e7f2/12044687/1e8f9a7b6727/es4c13026_0001.jpg
