Satriyo Purwana, Munawar Agus Arip
Department of Agribusiness, Faculty of Agriculture, Syiah Kuala University, Banda Aceh, Indonesia.
Department of Agricultural Engineering, Faculty of Agriculture, Syiah Kuala University, Banda Aceh, Indonesia.
Data Brief. 2020 Feb 6;29:105251. doi: 10.1016/j.dib.2020.105251. eCollection 2020 Apr.
This data article describes the analysis of near infrared spectroscopy (NIRS) data, with NIRS adopted as a portable technology for cocoa farmers in Aceh Province, Indonesia. NIRS assists farmers in post-harvest handling, especially in cocoa quality evaluation. The technology was used to determine the moisture content (MC) and fat content (FC) of intact cocoa bean samples rapidly and simultaneously. Near infrared spectra were acquired as absorbance spectra in the wavelength range from 1000 to 2500 nm, with 32 co-added scans, for a total of 72 intact bulk cocoa bean samples. The spectra can be used to predict the MC and FC of intact cocoa beans by establishing prediction models and validating them against actual MC and FC values measured with standard laboratory procedures. Prediction performance is evaluated using several statistical indicators: the correlation coefficient (r), the coefficient of determination (R²), the root mean square error (RMSE) and the residual predictive deviation (RPD) index. The spectra can be enhanced with spectral pre-treatment methods to improve prediction performance. Moreover, prediction models can be developed using principal component regression (PCR), partial least squares regression (PLSR) and other regression approaches. Ideal prediction models should have r and R² above 0.75, an RPD index above 2.0 and an RMSE lower than the standard deviation (SD) of the reference measurements. The dataset is available in raw MS Excel format and as files with the extension [...].
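The statistical indicators named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The formulas are common conventions taken as assumptions here: R² is computed as 1 minus the ratio of residual to total sum of squares, RPD as the SD of the reference values divided by RMSE, and SD as the sample standard deviation. The function names `prediction_metrics` and `is_ideal` and the example numbers are hypothetical and do not come from the dataset.

```python
import math
from statistics import mean, stdev


def prediction_metrics(actual, predicted):
    """Return r, R2, RMSE, SD and RPD for paired reference and predicted values."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")

    n = len(actual)
    mean_a, mean_p = mean(actual), mean(predicted)

    # Sums of squares and cross-products around the means.
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))

    # Residual sum of squares between reference and predicted values.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))

    r = cov / math.sqrt(ss_a * ss_p)   # Pearson correlation coefficient
    r2 = 1.0 - ss_res / ss_a           # assumed definition of R2
    rmse = math.sqrt(ss_res / n)       # root mean square error
    sd = stdev(actual)                 # sample SD of the reference values (assumption)
    rpd = sd / rmse                    # residual predictive deviation

    return {"r": r, "r2": r2, "rmse": rmse, "sd": sd, "rpd": rpd}


def is_ideal(m):
    """Apply the thresholds stated in the abstract to a metrics dictionary."""
    return (
        m["r"] > 0.75
        and m["r2"] > 0.75
        and m["rpd"] > 2.0
        and m["rmse"] < m["sd"]
    )


# Hypothetical moisture content values in percent, for illustration only.
actual_mc = [6.0, 7.0, 8.0, 9.0, 10.0]
predicted_mc = [6.2, 6.9, 8.1, 8.8, 10.0]

metrics = prediction_metrics(actual_mc, predicted_mc)
print(metrics)
print("meets the stated criteria:", is_ideal(metrics))
```

Under these definitions, the condition that RMSE is lower than SD is the same as RPD above 1, so it is already implied by an RPD above 2.0. It is kept as a separate check only to mirror the wording of the abstract.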