Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms.

Authors

Cheng Kaiqi, Dong Ruonan, Pan Fei, Su Wen, Xi Lingjie, Zhang Meng, Geng Jingzhang, Gao Ruichang, Jin Wengang, Abd El-Aty A M

Affiliations

Qinba State Key Laboratory of Biological Resources and Ecological Environment, QinLing-Bashan Mountains Bioresources Comprehensive Development 2011 C. I. C, Shaanxi Province Key Laboratory of Bio-Resources, College of Bioscience and Bioengineering, Shaanxi University of Technology, Hanzhong, China.

Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China.

Publication information

Front Nutr. 2025 May 20;12:1598875. doi: 10.3389/fnut.2025.1598875. eCollection 2025.

Abstract

INTRODUCTION

Pigmented rice appeals to consumers for its abundant phytochemicals and unique aroma.

METHODS

In this study, GC-MS-based metabolomics was performed on Yangxian colored rice varieties to characterize their volatile metabolites using multivariate statistics and machine learning algorithms.

RESULTS

A total of 357 volatile metabolites were detected and classified into 9 groups, including 96 organooxygen compounds (26.89%), 52 carboxylic acids and derivatives (14.57%), 42 fatty acyls (11.76%), 16 benzene and substituted derivatives (4.48%), and 11 hydroxy acids and derivatives (3.08%). Multivariate statistics (PLS-DA) screened 127 differentially abundant metabolites. In the principal component analysis, PC1 and PC2 accounted for 52.48% and 27.09% of the variance, respectively. Based on screening the differential metabolites for high multicollinearity (above 0.8) and the chi-square test (20% of the features), only 7 metabolites were found to represent the overall metabolite profile across the colored rice varieties. Four machine learning models were then used to classify the colored rice varieties; the random forest model performed best, with an accuracy of 0.97. Moreover, Shapley additive explanations (SHAP) analysis indicated that the 7 metabolites can serve as potential markers of the metabolomic profiles.
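
The feature-selection step described above can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' actual pipeline: it assumes that "multicollinearity above 0.8" means an absolute Pearson correlation greater than 0.8 between two metabolites, that one metabolite of each such pair is dropped greedily, and that the chi-square test then keeps the top 20% of the remaining features. The metabolite names (`m1` to `m5`), the variety labels, and the abundance values are hypothetical toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def drop_collinear(columns, threshold=0.8):
    """Greedy filter (assumed rule): keep a feature only if its |r| with
    every feature kept so far does not exceed the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def chi2_score(values, labels):
    """Chi-square statistic between a non-negative feature and class labels.
    Observed = per-class sum of the feature; expected = feature total
    multiplied by the class frequency."""
    total, n = sum(values), len(labels)
    score = 0.0
    for cls in set(labels):
        observed = sum(v for v, lab in zip(values, labels) if lab == cls)
        expected = total * labels.count(cls) / n
        if expected > 0:
            score += (observed - expected) ** 2 / expected
    return score


def select_top_fraction(columns, labels, names, fraction=0.2):
    """Rank the given features by chi-square score and keep the top fraction
    (rounded up, at least one feature)."""
    k = max(1, math.ceil(len(names) * fraction))
    ranked = sorted(names, key=lambda nm: chi2_score(columns[nm], labels),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data: 3 varieties (A, B, C) with 3 samples each.
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
abundance = {
    "m1": [9, 10, 11, 1, 2, 3, 5, 6, 7],        # high in A, low in B
    "m2": [18, 20, 22, 2, 4, 6, 10, 12, 14],    # exactly 2 * m1 (collinear)
    "m3": [4, 6, 5, 5, 4, 6, 6, 5, 4],          # no class pattern
    "m4": [2, 1, 3, 2, 3, 1, 9, 10, 8],         # high in C
    "m5": [3, 1, 3, 2, 4, 1, 9, 11, 8],         # near-copy of m4 (collinear)
}

kept = drop_collinear(abundance, threshold=0.8)
selected = select_top_fraction(abundance, labels, kept, fraction=0.2)
print("after collinearity filter:", kept)
print("after chi-square selection:", selected)
```

On this toy data the collinearity filter drops `m2` and `m5`, and the chi-square ranking keeps `m4`, whose abundance is concentrated in a single variety. In the study itself the same kind of reduction left 7 metabolites, which were then passed to the classifiers.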

CONCLUSIONS

These results suggest that GC-MS-based metabolomics combined with a random forest model might be effective for extracting key features that distinguish pigmented rice varieties.
