Structure Predictions for Non-targeted Analysis: From Physicochemical Properties to Molecular Structures.

Affiliations

Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California San Francisco, San Francisco, California 94143, United States.

Department of Civil and Environmental Engineering, University of California, Davis, One Shields Avenue, Davis, California 95616, United States.

Publication information

J Am Soc Mass Spectrom. 2022 Jul 6;33(7):1134-1147. doi: 10.1021/jasms.1c00386. Epub 2022 Jun 1.

Abstract

While important advances have been made in high-resolution mass spectrometry (HRMS) and its applications in non-targeted analysis (NTA), the number of identified compounds in biological and environmental samples often does not exceed 5% of the detected chemical features. Our aim was to develop a computational pipeline that leverages data from HRMS but also incorporates physicochemical properties (equilibrium partition ratios between organic solvents and water) and can propose molecular structures for detected chemical features. As these physicochemical properties are often sufficiently different across isomers, when put together, they can form a unique profile for each isomer, which we describe as the "physicochemical fingerprint". In our study, we used a comprehensive database of compounds that have been previously reported in human blood and collected their partition ratio values for 129 partitioning systems. We used RDKit to calculate the number of RDKit fragments and the number of RDKit bits per molecule. We then developed and trained an artificial neural network, which used as an input the physicochemical fingerprint of a chemical feature and predicted the number and types of RDKit fragments and RDKit bits present in that structure. These were then used to search the database and propose chemical structures. The average success rate of predicting the correct chemical structure ranged from 60 to 86% for the training set and from 48 to 81% for the testing set. These observations suggest that physicochemical fingerprints can assist in the identification of compounds with NTA and substantially increase the number of identified compounds.
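The final step described in the abstract, using predicted fragment counts to search a database and propose structures, can be illustrated with a minimal sketch. The abstract does not state how predictions are matched against database entries, so the Manhattan distance used here is an assumption, and the database, candidate names, and count vectors are hypothetical toy values rather than real RDKit descriptors.

```python
from typing import Dict, List, Tuple

# Hypothetical toy database: candidate name -> fragment-count vector.
# Each position stands for a made-up fragment type, not a real RDKit descriptor.
DATABASE: Dict[str, List[int]] = {
    "candidate_A": [2, 0, 1, 0],
    "candidate_B": [1, 1, 1, 0],
    "candidate_C": [0, 2, 0, 3],
}


def manhattan(a: List[int], b: List[int]) -> int:
    """Sum of absolute differences between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_candidates(
    predicted: List[int], database: Dict[str, List[int]]
) -> List[Tuple[str, int]]:
    """Return (name, distance) pairs sorted from closest to farthest match.

    Assumption: a smaller Manhattan distance means a better structural match.
    Ties are broken alphabetically by name so the ordering is deterministic.
    """
    scored = [(name, manhattan(predicted, counts)) for name, counts in database.items()]
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Hypothetical fragment counts, standing in for the output of a trained network.
predicted_counts = [2, 0, 1, 1]
ranking = rank_candidates(predicted_counts, DATABASE)

for name, distance in ranking:
    print(f"{name}: distance {distance}")
```

In a real pipeline the predicted vector would come from the trained neural network and the database vectors from RDKit, and a ranked list lets several candidate structures be proposed for each chemical feature instead of a single forced match.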

Cited by

Molecular guidelines for promising antimicrobial agents.
Sci Rep. 2024 Feb 26;14(1):4641. doi: 10.1038/s41598-024-55418-6.

References

A Comprehensive Non-targeted Analysis Study of the Prenatal Exposome.
Environ Sci Technol. 2021 Aug 3;55(15):10542-10557. doi: 10.1021/acs.est.1c01010. Epub 2021 Jul 14.
