Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, Højbakkegård Allé 13, 2630, Taastrup, Denmark.
J Sci Food Agric. 2013 Dec;93(15):3710-9. doi: 10.1002/jsfa.6207. Epub 2013 Jun 7.
Visible-near infrared (visible-NIR) spectroscopy is attracting increasing interest as a fast alternative for evaluating fruit quality. The success of the method is generally assumed to depend on large sample sets that yield robust calibration models. In this study we used representative samples of an early and a late season apple cultivar to evaluate model robustness (in terms of prediction ability and error) for the prediction of soluble solids content (SSC) and acidity in the wavelength range 400-1100 nm.
A total of 196 middle-early season ('Aroma') and 219 late season ('Holsteiner Cox') apple (Malus domestica Borkh.) samples were used to construct spectral models for SSC and acidity. Prediction models were built with partial least squares (PLS), ridge regression (RR) and elastic net (EN). Furthermore, we compared three sub-sample arrangements for forming training and test sets: the 'smooth fractionator', ordering by date of measurement after harvest, and random selection. With the 'smooth fractionator' sampling method, using fewer spectral bands (26) and elastic net improved the performance of SSC models for 'Aroma' apples, whose SSC had a coefficient of variation (CV) of 13%. The model showed consistently low errors and bias (PLS/EN: R² cal = 0.60/0.60; SEC = 0.88/0.88 °Brix; Bias cal = 0.00/0.00; R² val = 0.33/0.44; SEP = 1.14/1.03; Bias val = 0.04/0.03). However, prediction of acidity and of SSC (CV = 5%) for the late cultivar 'Holsteiner Cox' produced inferior results compared with 'Aroma'.
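The figures of merit quoted above (CV, bias, standard errors, R²) can be illustrated with a short Python sketch. The abstract does not define these statistics, so the conventions used here are assumptions: bias as the mean residual, the standard error as the bias-corrected standard deviation of the residuals, and R² as one minus the ratio of residual to total sum of squares. The function names and the SSC values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation relative to the mean."""
    return 100.0 * stdev(values) / mean(values)


def prediction_stats(reference, predicted):
    """Summary statistics for predicted versus reference values.

    Assumed conventions (not taken from the abstract):
      bias = mean of the residuals (predicted - reference)
      rmse = root mean squared residual
      sep  = bias-corrected standard error, with n - 1 in the denominator
      r2   = 1 - SS_residual / SS_total
    """
    residuals = [p - r for p, r in zip(predicted, reference)]
    n = len(residuals)
    bias = mean(residuals)
    rmse = math.sqrt(sum(e * e for e in residuals) / n)
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    ref_mean = mean(reference)
    ss_total = sum((r - ref_mean) ** 2 for r in reference)
    ss_residual = sum(e * e for e in residuals)
    r2 = 1.0 - ss_residual / ss_total
    return {"bias": bias, "rmse": rmse, "sep": sep, "r2": r2}


if __name__ == "__main__":
    # Hypothetical SSC values in degrees Brix, for illustration only.
    reference = [11.2, 12.5, 13.1, 10.8, 12.0, 13.6]
    predicted = [11.6, 12.1, 13.4, 11.3, 12.2, 13.0]
    print("CV of reference (%):", coefficient_of_variation(reference))
    print(prediction_stats(reference, predicted))
```

Applied to the training set, the same function would give calibration statistics (SEC, Bias cal, R² cal); applied to the held-out test set, it would give validation statistics (SEP, Bias val, R² val).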
It was possible to construct local SSC and acidity calibration models for early season apple cultivars with CVs of SSC and acidity around 10%. The overall model performance for these data sets also depends on the proper selection of training and test sets. The 'smooth fractionator' protocol provided an objective method for obtaining training and test sets that capture the existing variability of the fruit samples for the construction of visible-NIR prediction models. The implication is that relatively small sample sizes should suffice to develop spectral predictions of fruit quality, provided that such 'efficient' sampling methods are used both to obtain an initial sample of fruit that represents the variability of the population and to sub-sample it into training and test sets. Using feature selection and elastic net appears to improve the SSC model performance in terms of R², RMSECV and RMSEP for 'Aroma' apples.
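The abstract does not spell out the 'smooth fractionator' procedure, so the following Python sketch shows only the general idea as it is usually described in the sampling literature: rank the items by some auxiliary variable, rearrange them so that values rise and then fall smoothly, and take a systematic sample with a random start. The ranking variable, the sampling period of 3 and all names below are assumptions made for illustration, not the authors' settings.

```python
import random


def smooth_arrangement(items, key):
    """Rank items by `key`, then list every second item in ascending order
    followed by the remaining items in descending order."""
    ranked = sorted(items, key=key)
    return ranked[0::2] + ranked[1::2][::-1]


def smooth_fractionator_split(items, key, period=3, rng=None):
    """Split items into (train, test) by systematic sampling of the
    smooth arrangement: every `period`-th item, from a random start,
    goes to the test set and the rest go to the training set."""
    rng = rng or random.Random()
    arranged = smooth_arrangement(items, key)
    start = rng.randrange(period)
    test = arranged[start::period]
    train = [x for i, x in enumerate(arranged) if i % period != start]
    return train, test


if __name__ == "__main__":
    # Hypothetical fruit records: (sample id, ranking variable).
    apples = [("a1", 10.8), ("a2", 13.6), ("a3", 12.0), ("a4", 11.2),
              ("a5", 13.1), ("a6", 12.5), ("a7", 11.9)]
    train, test = smooth_fractionator_split(
        apples, key=lambda a: a[1], period=3, rng=random.Random(42))
    print("train:", train)
    print("test:", test)
```

Because neighbouring items in the smooth arrangement have similar values, a systematic sample taken from it tends to span the full range of the ranking variable, which is the property the conclusions rely on when they argue for relatively small sample sizes.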