Béjar-Grimalt Jaume, Pérez-Guaita David, Sánchez-Illana Ángel, García-Contreras Rodolfo, Kataria Rashmi, Bureau Sylvie, de la Guardia Miguel, Cadet Frédéric
Department of Analytical Chemistry, University of Valencia, 46100 Burjassot, Spain.
Departamento de Microbiología y Parasitología, Facultad de Medicina, Universidad Nacional Autonoma de Mexico, 04510 Mexico City, Mexico.
ACS Agric Sci Technol. 2025 Jul 8;5(7):1373-1381. doi: 10.1021/acsagscitech.5c00068. eCollection 2025 Jul 21.
This work investigated ATR-FTIR spectroscopy combined with machine learning for classifying eight apricot varieties. Traditionally, variety identification relies on physicochemical property measurements, which are time-consuming and require laboratory analysis. Instead, we used ATR-FTIR spectra from 731 apricots, divided into calibration (512) and test (219) sets, together with three machine learning models (partial least-squares discriminant analysis (PLS-DA), support vector machine (SVM), and random forest (RF)), and correctly classified 97% of the test samples. Additionally, careful inspection of the PLS-DA regression vectors revealed a strong correlation between the spectra and the biochemical composition in terms of sugars and organic acids, supporting ATR-FTIR spectroscopy as a viable alternative for variety identification. Finally, to validate the results, additional models were built from the physicochemical data from the apricots. These reference models were tested using the same data splits as the spectroscopic models, and both approaches gave similar results.