A comparison of three liquid chromatography (LC) retention time prediction models.

Author information

McEachran Andrew D, Mansouri Kamel, Newton Seth R, Beverly Brandiese E J, Sobus Jon R, Williams Antony J

Affiliations

Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, US Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA; National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA.

National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA.

Publication information

Talanta. 2018 May 15;182:371-379. doi: 10.1016/j.talanta.2018.01.022. Epub 2018 Jan 11.

Abstract

High-resolution mass spectrometry (HRMS) data have revolutionized the identification of environmental contaminants through non-targeted analysis (NTA). However, chemical identification remains challenging due to the vast number of unknown molecular features typically observed in environmental samples. Advanced data processing techniques are required to improve chemical identification workflows. The ideal workflow brings together a variety of data and tools to increase the certainty of identification. One such tool is chromatographic retention time (RT) prediction, which can be used to reduce the number of possible suspect chemicals within an observed RT window. This paper compares the relative predictive ability and applicability to NTA workflows of three RT prediction models: (1) a logP (octanol-water partition coefficient)-based model using EPI Suite™ logP predictions; (2) a commercially available ACD/ChromGenius model; and (3) a newly developed Quantitative Structure Retention Relationship model called OPERA-RT. Models were developed using the same training set of 78 compounds with experimental RT data and evaluated for external predictivity on an identical test set of 19 compounds. Both the ACD/ChromGenius and OPERA-RT models outperformed the EPI Suite™ logP-based RT model (R = 0.81-0.92, 0.86-0.83, and 0.66-0.69 for training-test sets, respectively). Further, both OPERA-RT and ACD/ChromGenius predicted 95% of RTs within a ±15% chromatographic time window of experimental RTs. Based on these results, we simulated an NTA workflow with a ten-fold larger list of candidate structures generated for formulae of the known test set chemicals using the U.S. EPA's CompTox Chemistry Dashboard (https://comptox.epa.gov/dashboard). RTs for all candidates were predicted using both ACD/ChromGenius and OPERA-RT, and RT screening windows were assessed for their ability to filter out unlikely candidate chemicals and enhance potential identification. Compared to ACD/ChromGenius, OPERA-RT screened out a greater percentage of candidate structures within a 3-min RT window (60% vs. 40%) but retained fewer of the known chemicals (42% vs. 83%). By several metrics, the OPERA-RT model, generated as a proof-of-concept using a limited set of open source data, performed as well as the commercial tool ACD/ChromGenius when constrained to the same small training and test sets. As the availability of RT data increases, we expect the OPERA-RT model's predictive ability will increase.
