Retention Index Prediction Using Quantitative Structure-Retention Relationships for Improving Structure Identification in Nontargeted Metabolomics.

Author information

Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart, 7001 Tasmania, Australia.

Pfizer Global Research and Development, Sandwich CT13 9NJ, U.K.

Publication information

Anal Chem. 2018 Aug 7;90(15):9434-9440. doi: 10.1021/acs.analchem.8b02084. Epub 2018 Jul 10.

Abstract

Structure identification in nontargeted metabolomics based on liquid chromatography coupled to mass spectrometry (LC-MS) remains a significant challenge. Quantitative structure-retention relationship (QSRR) modeling can accelerate the structure identification of metabolites by predicting their retention, allowing false positives to be eliminated during the interpretation of metabolomics data. In this work, 191 compounds were grouped according to molecular weight, and a QSRR study was carried out on the 34 resulting groups to eliminate false positives. Partial least squares (PLS) regression combined with a genetic algorithm (GA) was applied to construct linear QSRR models based on a variety of VolSurf+ molecular descriptors. A novel dual-filtering approach, which combines Tanimoto similarity (TS) searching as the primary filter and retention index (RI) similarity clustering as the secondary filter, was used to select the training-set compounds from which the QSRR models were derived, yielding an R² of 0.8512 and an average root mean square error in prediction (RMSEP) of 8.45%. With a retention index filter expressed as ±2 standard deviations (SD) of the error, representative compounds were predicted with >91% accuracy, and for 53% of the groups (18/34), at least one false positive compound could be eliminated. The proposed strategy can thus narrow down the number of false positives to be assessed in nontargeted metabolomics.
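
The two numerical ideas named in the abstract, Tanimoto similarity between compounds and an RI acceptance window of ±2 SD, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes fingerprints are represented as sets of "on" bit indices and that each candidate structure already has a predicted RI. The function names, the similarity cutoff, and all numeric values are hypothetical. The secondary filter (RI similarity clustering) and the GA-PLS model itself are not shown.

```python
def tanimoto(bits_a: set[int], bits_b: set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of 'on' bits."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def similar_compounds(target_bits: set[int],
                      library: dict[str, set[int]],
                      cutoff: float) -> list[str]:
    """Primary filter only: library compounds whose similarity to the target
    meets a (hypothetical) cutoff."""
    return [name for name, bits in library.items()
            if tanimoto(target_bits, bits) >= cutoff]


def plausible_candidates(observed_ri: float,
                         predicted_ris: dict[str, float],
                         error_sd: float,
                         n_sd: float = 2.0) -> list[str]:
    """Keep candidates whose predicted RI lies within n_sd standard deviations
    of the observed RI; the rest are treated as false positives."""
    tolerance = n_sd * error_sd
    return [name for name, ri in predicted_ris.items()
            if abs(ri - observed_ri) <= tolerance]


if __name__ == "__main__":
    # Hypothetical fingerprints and retention indices, for illustration only.
    library = {"cmpd_A": {1, 2, 3, 5}, "cmpd_B": {7, 8, 9}}
    print(similar_compounds({1, 2, 3, 4}, library, cutoff=0.5))
    print(plausible_candidates(500.0, {"cand_1": 510.0, "cand_2": 560.0},
                               error_sd=20.0))
```

In this sketch a candidate is rejected when its predicted RI differs from the observed RI by more than twice the standard deviation of the prediction error, which mirrors the ±2 SD filter described above; how the SD is estimated for each group is left open here.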

