1 Department of Chemistry, Universidade Federal de Viçosa, 36570-900, Viçosa, Minas Gerais, Brazil.
2 Department of Plant Science, Universidade Federal de Viçosa, 36570-900, Viçosa, Minas Gerais, Brazil.
Appl Spectrosc. 2017 Aug;71(8):2001-2012. doi: 10.1177/0003702817704147. Epub 2017 Apr 28.
This work presents multivariate calibration models built with near-infrared (NIR) spectroscopy and partial least squares (PLS) regression to estimate the lignin content in different parts of sugarcane genotypes. Reference lignin contents were determined in the laboratory by the Klason method. The independent variables were obtained from different materials: dry bagasse, bagasse-with-juice, leaf, and stalk. NIR spectra in the range of 10 000–4000 cm⁻¹ were acquired directly from each material. The models were built using PLS regression, and different variable selection algorithms were tested and compared: iPLS, biPLS, genetic algorithm (GA), and the ordered predictors selection (OPS) method. The best models were obtained by feature selection with the OPS algorithm. The values of the root mean square error of prediction (RMSEP), correlation coefficient of prediction (R), and ratio of performance to deviation (RPD) were, respectively, 0.85, 0.97, and 2.87 for dry bagasse; 0.65, 0.94, and 2.77 for bagasse-with-juice; 0.58, 0.96, and 2.56 for leaf; 0.61, 0.95, and 3.24 for the middle stalk; and 0.58, 0.96, and 2.34 for the top stalk. The OPS algorithm selected fewer variables while giving greater predictive capacity. All the models are reliable, predict lignin in sugarcane with high accuracy, and significantly reduce analysis time, cost, and chemical reagent consumption, thus optimizing the entire process. In general, the future application of these models should have a positive impact on the biofuels industry, where rapid decisions are needed regarding clone production and genetic breeding programs.
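For readers unfamiliar with the three figures of merit reported in the abstract, the following is a minimal Python sketch of how they are commonly computed from reference and predicted values. It is illustrative only and is not the authors' code. It assumes that RPD is the sample standard deviation of the reference values in the prediction set divided by RMSEP (definitions vary between authors), and the function name `prediction_metrics` and the sample values are hypothetical.

```python
import math


def prediction_metrics(y_ref, y_pred):
    """Return (RMSEP, R, RPD) for reference vs. predicted values.

    Assumption: RPD = sample standard deviation of the reference
    values divided by RMSEP. Other definitions exist.
    """
    n = len(y_ref)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    # Root mean square error of prediction.
    rmsep = math.sqrt(sum((p - r) ** 2 for r, p in zip(y_ref, y_pred)) / n)

    # Pearson correlation between reference and predicted values.
    mean_ref = sum(y_ref) / n
    mean_pred = sum(y_pred) / n
    ss_ref = sum((r - mean_ref) ** 2 for r in y_ref)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cross = sum((r - mean_ref) * (p - mean_pred) for r, p in zip(y_ref, y_pred))
    corr = cross / math.sqrt(ss_ref * ss_pred)

    # Ratio of performance to deviation.
    sd_ref = math.sqrt(ss_ref / (n - 1))
    rpd = sd_ref / rmsep

    return rmsep, corr, rpd


if __name__ == "__main__":
    # Hypothetical lignin contents, not data from the study.
    reference = [20.0, 22.0, 24.0, 26.0, 28.0]
    predicted = [21.0, 21.0, 25.0, 25.0, 29.0]
    rmsep, corr, rpd = prediction_metrics(reference, predicted)
    print(f"RMSEP={rmsep:.2f}  R={corr:.2f}  RPD={rpd:.2f}")
```

Under this definition, a lower RMSEP and a higher RPD both indicate that prediction errors are small relative to the natural spread of lignin content in the samples.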