Zhang Guifang, Liu Jinming, Li Zhiming, Li Nuo, Zhang Dongjie
National Coarse Cereal Engineering Technology Research Center, Heilongjiang Bayi Agricultural University, Daqing, 163319, Heilongjiang, People's Republic of China.
College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Daqing, 163319, Heilongjiang, People's Republic of China.
Sci Rep. 2025 Feb 18;15(1):5848. doi: 10.1038/s41598-024-83894-3.
An origin discrimination model for rice from five production regions in Heilongjiang Province was constructed by combining confocal microscopy Raman spectroscopy with chemometrics. A total of 150 field rice samples were collected from the Fangzheng, Chahayang, Jiansanjiang, Xiangshui, and Wuchang production areas. The optimal sample processing conditions, instrument parameter settings, and spectrum acquisition techniques were identified by investigating the influencing factors. The Raman spectra of milled rice in the range of 100-3200 cm⁻¹ were selected as the raw data, and the optimal preprocessing combination was identified as normalization, Savitzky-Golay smoothing, and multivariate scatter correction. Subsequently, the competitive adaptive reweighted sampling (CARS) and discrete binary particle swarm optimization (BPSO) algorithms were employed to optimize feature wavelength selection, yielding 226 and 1899 feature wavelength variables, respectively. Using the full Raman spectrum data and the feature wavelength data as inputs, origin discrimination models were constructed with partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), and extreme learning machine (ELM). The results indicated that the BPSO-GA-SVM model exhibited the best predictive ability, achieving a testing set accuracy of 86.67%.
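The preprocessing chain named in the abstract (normalization, Savitzky-Golay smoothing, scatter correction) can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not give the normalization type, the smoothing window, or the polynomial order, so min-max scaling and the standard 5-point quadratic Savitzky-Golay filter are assumptions chosen for illustration. The scatter correction step follows the usual definition (more commonly called multiplicative scatter correction), in which each spectrum is regressed on the mean spectrum. All function names are hypothetical.

```python
from statistics import fmean

# Standard Savitzky-Golay coefficients for a 5-point window and a
# quadratic fit. The window size is an assumption; the abstract gives none.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def minmax_normalize(spectrum):
    """Scale one spectrum to [0, 1] (assumed normalization type)."""
    lo, hi = min(spectrum), max(spectrum)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in spectrum]
    return [(v - lo) / span for v in spectrum]


def sg_smooth(spectrum):
    """Apply the 5-point filter; the two points at each edge stay unchanged."""
    half = len(SG5) // 2
    out = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out[i] = sum(c * v for c, v in zip(SG5, window))
    return out


def msc(spectra):
    """Scatter correction: regress each spectrum on the mean spectrum,
    then remove the fitted offset and scale.

    Assumes the mean spectrum is not flat and that no spectrum has a
    zero fitted slope.
    """
    n_points = len(spectra[0])
    reference = [fmean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = fmean(reference)
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = fmean(s)
        # Least-squares fit: s ~ intercept + slope * reference
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def preprocess(spectra):
    """Run the three steps in the order listed in the abstract."""
    smoothed = [sg_smooth(minmax_normalize(s)) for s in spectra]
    return msc(smoothed)
```

Here `spectra` is a list of equal-length intensity lists, one per sample. In practice a numerical library would replace the explicit loops; the plain-Python form is used only to keep each step visible.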