Department of Chemistry, Loyola Science Center, The University of Scranton, Scranton, PA 18510, USA.
Department of Mathematics and Physical Sciences, Louisiana State University - Alexandria, Alexandria, LA 71302, USA.
Spectrochim Acta A Mol Biomol Spectrosc. 2022 Aug 5;276:121186. doi: 10.1016/j.saa.2022.121186. Epub 2022 Mar 29.
Facile, robust, and accurate analyses of honey adulterants are required in the honey industry to assess purity before commercialization. A stacked regression ensemble approach based on Fourier transform infrared spectroscopy was developed for the quantitative determination of corn, cane, beet, and rice syrup adulterants in honey. Models built on a training set (n = 81) were used to predict the percent composition of these adulterants in an independent test set (n = 32). The performance of several machine learning techniques was compared: support vector regression with a linear kernel, least absolute shrinkage and selection operator (LASSO), ridge regression, elastic net, partial least squares, random forests, recursive partitioning and regression trees, gradient boosting, and Gaussian process regression. The predictive performance of these approaches was then compared with that of stacked regression, an ensemble learning technique that combines the predictions of the individual techniques. Results show that stacked regression did not consistently outperform the other techniques across all four syrup adulterants in the test set. The elastic net generalized linear model gave the best results (root mean square error of prediction (RMSEP) = 0.0107, R = 0.809) across all four honey adulterants. Elastic net coupled with Fourier transform infrared spectroscopy may offer a novel, direct, and accurate method of simultaneously quantifying corn, cane, beet, and rice syrup adulterants in honey.
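The two figures of merit quoted above can be illustrated with a minimal sketch. It assumes that RMSEP is the root mean square difference between reference and predicted adulterant fractions in the test set, and that R is the Pearson correlation between those two sets of values; the abstract does not define R, so that reading is an assumption. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out test set."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(y_true, y_pred):
    """Pearson correlation between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical adulterant fractions (0.10 means 10 % syrup); illustration only.
reference = [0.00, 0.05, 0.10, 0.20, 0.30]
predicted = [0.01, 0.04, 0.12, 0.19, 0.31]

print(f"RMSEP = {rmsep(reference, predicted):.4f}")
print(f"R     = {pearson_r(reference, predicted):.4f}")
```

Computing both metrics on the independent test set only, as the study does with its n = 32 samples, keeps the comparison between models from rewarding overfitting to the training spectra.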