Yang Qing, Li Yujiao, Li Jie, Zhang Zhiyou, Liu Qiqi, Guo Ge, Wang Shuang, Wang Xiaoyu, Xie Guanghui
Agricultural Equipment Institute of Hunan/Hunan Intelligent Agriculture Engineering Technology Research Center/Hunan Branch Center of National Energy R&D Center for Non-Food Biomass, Changsha 410125, China.
Yueyang Academy of Agriculture Sciences and Researches, Yueyang 414022, China.
ACS Omega. 2025 Apr 8;10(15):14755-14769. doi: 10.1021/acsomega.4c09155. eCollection 2025 Apr 22.
Rapid detection of crop grain components is crucial for effective production and energy conversion. We applied sample set division to build multiple sample sets and optimize near-infrared spectroscopy (NIRS) models for rapid prediction of protein and fat content. A total of 1243 and 415 crop grain samples were screened and divided into 5 and 4 sets, respectively. The best modeling methods for protein were N (Norris derivative)+D (detrending)-C (CARS)-P (PLS) and N+M (MC-UVE)-C-P, while those for fat were N+M-C-P and N+S (Savitzky-Golay)-C-P. The SS (Soybean Set), KS (Sorghum Set), and FS (Full Samples Set) data sets provided accurate protein content analysis, while the FS and SS data sets were suitable for both protein content prediction and evaluation. For fat, the FS, SS, and CS (Cereal Set) models met content analysis requirements, with the FS model suitable for external validation. This study compared the fitness, robustness, and accuracy of NIRS models built on sample sets produced by different division methods, providing a new approach, together with theoretical and technical support, for the green and rapid detection of major components in biomass raw materials.
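The "N+D" part of the N+D-C-P method name denotes two spectral preprocessing steps, a Norris derivative followed by detrending, applied before variable selection (CARS) and regression (PLS). The sketch below is only an illustration of what such steps commonly look like, not the authors' implementation. The function names, the segment and gap widths, the choice of a straight-line (first-order) detrend, and the order in which the two steps are chained are all assumptions, and the spectrum is synthetic.

```python
def norris_gap_derivative(spectrum, segment=5, gap=5):
    """Assumed Norris-style first derivative: moving-average smoothing over
    `segment` points, then a central difference taken across `gap` points.
    The output is shorter than the input by 2 * gap points."""
    n = len(spectrum)
    half = segment // 2
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)  # window is truncated at the edges
        window = spectrum[lo:hi]
        smoothed.append(sum(window) / len(window))
    return [(smoothed[i + gap] - smoothed[i - gap]) / (2 * gap)
            for i in range(gap, n - gap)]


def detrend(spectrum):
    """Subtract the least-squares straight line fitted against the point index.
    (A first-order fit is an assumption made here for brevity.)"""
    n = len(spectrum)
    x_mean = (n - 1) / 2
    y_mean = sum(spectrum) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(spectrum))
    slope = sxy / sxx
    return [y - (y_mean + slope * (i - x_mean)) for i, y in enumerate(spectrum)]


def preprocess_n_plus_d(spectrum):
    """Hypothetical 'N+D' chain: derivative first, then detrending."""
    return detrend(norris_gap_derivative(spectrum))


# Synthetic absorbance-like values: a sloped baseline plus one broad peak.
synthetic = [0.01 * i + (0.5 if 12 <= i <= 17 else 0.0) for i in range(30)]
processed = preprocess_n_plus_d(synthetic)
```

The processed spectra would then be passed to a variable selection step and a PLS regression, neither of which is shown here.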