Rapid Determination of Polysaccharides in Cistanche tubulosa Using Near-Infrared Spectroscopy Combined with Machine Learning.

Affiliations

School of Pharmacy, Xinjiang Medical University, Xinyi Road, Urumqi 830011, China.

Key Laboratory of Active Components of Xinjiang Natural Medicine and Drug Release Technology, Xinyi Road, Urumqi 830011, China.

Publication information

J AOAC Int. 2023 Jul 17;106(4):1118-1125. doi: 10.1093/jaoacint/qsac144.

Abstract

BACKGROUND

Cistanche tubulosa is used as both a medicine and a food: it has distinctive medicinal value and is widely used in healthcare products. Polysaccharide content is one of its important quality indicators.

OBJECTIVE

In this study, an analytical model based on near-infrared (NIR) spectroscopy combined with machine learning was established to predict the polysaccharide content of C. tubulosa.

METHODS

The polysaccharide content of the samples, determined by the phenol-sulfuric acid method, was used as the reference value, and machine learning was applied to relate the spectral information to that reference value. The samples were divided into a calibration set and a prediction set using the Kennard-Stone algorithm. The model was optimized with various preprocessing methods, including Savitzky-Golay (SG) filtering, standard normal variate (SNV), multiplicative scatter correction (MSC), first-order derivative (FD), second-order derivative (SD), and combinations of these. Variable selection was performed with the successive projections algorithm (SPA) and stability competitive adaptive reweighted sampling (sCARS). Four machine learning methods were used to build quantitative models: random forest (RF), partial least-squares (PLS), principal component regression (PCR), and support vector machine (SVM). The models were evaluated by the coefficient of determination (R2), root-mean-square error (RMSE), and residual prediction deviation (RPD).
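Two of the steps named above, SNV preprocessing and the Kennard-Stone split, are simple enough to sketch in a few lines. The following is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `snv` and `kennard_stone`, the use of Euclidean distance, and the toy data are assumptions made for the example.

```python
from math import dist
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and (sample) standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(value - m) / s for value in spectrum]


def kennard_stone(samples, n_select):
    """Return the indices of n_select samples chosen by Kennard-Stone.

    Assumes Euclidean distance between spectra. The selected indices form
    the calibration set; the remaining samples form the prediction set.
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Pairwise distance matrix between all samples.
    d = [[dist(a, b) for b in samples] for a in samples]

    # Seed the selection with the two most distant samples.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda p: d[p[0]][p[1]])
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]

    # Repeatedly add the sample whose nearest selected neighbour is farthest.
    while len(selected) < n_select:
        pick = max(remaining, key=lambda k: min(d[k][s] for s in selected))
        selected.append(pick)
        remaining.remove(pick)
    return selected


# Toy data (hypothetical, two variables per "spectrum").
toy = [[0.0, 0.0], [10.0, 0.0], [5.0, 0.0], [1.0, 0.0]]
calibration_idx = kennard_stone(toy, 3)
prediction_idx = [k for k in range(len(toy)) if k not in calibration_idx]
```

Because Kennard-Stone always picks the sample farthest from those already chosen, the calibration set tends to span the spectral space, while the prediction set falls inside it.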

RESULTS

RF performed best among the four machine learning models. For the calibration set, the coefficient of determination (R2c) was 0.9763 and the root-mean-square error (RMSEC) was 0.3527%. For the prediction set, the coefficient of determination (R2p) was 0.9230, the root-mean-square error (RMSEP) was 0.5130%, and the RPD was 3.33.
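The abstract reports R2, RMSE, and RPD without giving formulas. The sketch below uses commonly cited definitions and is an assumption rather than the authors' exact calculation. In particular, RPD is taken here as the sample standard deviation of the reference values divided by the RMSE, and other conventions exist. The function name `regression_metrics` and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def regression_metrics(reference, predicted):
    """Compute R2, RMSE, and RPD for paired reference and predicted values.

    Assumed definitions:
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = sample standard deviation of reference values / RMSE
    """
    residuals = [r - p for r, p in zip(reference, predicted)]
    ss_res = sum(e * e for e in residuals)
    ref_mean = mean(reference)
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    rmse = sqrt(ss_res / len(reference))
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        "rpd": stdev(reference) / rmse,
    }


# Hypothetical values for illustration only; not data from the study.
metrics = regression_metrics(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 1.9, 3.1, 3.9, 5.1],
)
```

Applied to the calibration set, the same function would give R2c and RMSEC; applied to the prediction set, it would give R2p, RMSEP, and RPD.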

CONCLUSION

The results indicate that NIR spectroscopy combined with RF is an effective method for the quality evaluation of C. tubulosa polysaccharides.

HIGHLIGHTS

Four quantitative models were developed to predict the polysaccharide content of C. tubulosa, with good results. The characteristic variables were largely identified by the sCARS algorithm, and the corresponding characteristic groups were analyzed.
