Integration of deep neural network modeling and LC-MS-based pseudo-targeted metabolomics to discriminate easily confused ginseng species.

Authors

Jiang Meiting, Sha Yuyang, Zou Yadan, Xu Xiaoyan, Ding Mengxiang, Lian Xu, Wang Hongda, Wang Qilong, Li Kefeng, Guo De-An, Yang Wenzhi

Affiliations

State Key Laboratory of Chinese Medicine Modernization, Tianjin University of Traditional Chinese Medicine, Tianjin, 301617, China.

Haihe Laboratory of Modern Chinese Medicine, Tianjin, 301617, China.

Publication information

J Pharm Anal. 2025 Jan;15(1):101116. doi: 10.1016/j.jpha.2024.101116. Epub 2024 Sep 26.

Abstract

Metabolomics covers a wide range of applications in life sciences, biomedicine, and phytology. Data acquisition (to achieve high coverage and efficiency) and data analysis (to pursue good classification) are the two key segments of a metabolomics workflow. Various chemometric approaches based on either pattern recognition or machine learning have been employed to separate different groups. However, insufficient feature extraction, inappropriate feature selection, overfitting, or underfitting can leave these approaches unable to discriminate plants that are easily confused. Using two ginseng varieties that contain similar ginsenosides, namely Panax japonicus (PJ) and P. japonicus var. major (PJvm), we integrated pseudo-targeted metabolomics and deep neural network (DNN) modeling to achieve accurate species differentiation. The pseudo-targeted metabolomics approach was optimized with respect to data acquisition mode, ion pair generation, the choice between multiple reaction monitoring (MRM) and scheduled MRM (sMRM), and the chromatographic elution gradient. In total, 1980 ion pairs were monitored within 23 min, allowing for the most comprehensive ginseng metabolome analysis. The established DNN model demonstrated excellent classification performance (in terms of accuracy, precision, recall, F1 score, area under the curve, and receiver operating characteristic (ROC)) on both the entire metabolome dataset and the feature-selected dataset, outperforming random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and multilayer perceptron (MLP). Moreover, DNNs were advantageous for automated feature learning, nonlinear modeling, adaptability, and generalization. This study confirmed the practicality of the established strategy for efficient metabolomics data analysis and reliable classification, even with small-volume samples. The approach holds promise for plant metabolomics and is not limited to ginseng.
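The classification metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code or data: the labels and scores below are invented toy values, the function names `binary_metrics` and `roc_auc` are hypothetical, and treating PJvm as the positive class is an arbitrary assumption made only for the example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) sample pairs in which the positive sample
    receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): 1 = PJvm, 0 = PJ; scores are hypothetical
# model outputs for the positive class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 decision threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which is one reason the two kinds of measure are usually reported together.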
