Pokhrel Dharma Raj, Sirisomboon Panmanas, Khurnpoon Lampan, Posom Jetsada, Saechua Wanphut
Department of Agricultural Engineering, School of Engineering, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand.
School of Agricultural Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand.
Sensors (Basel). 2023 Jun 4;23(11):5327. doi: 10.3390/s23115327.
The aim of this study was to evaluate and compare the performance of multivariate classification algorithms, specifically Partial Least Squares Discriminant Analysis (PLS-DA) and machine learning algorithms, in classifying Monthong durian pulp by its dry matter content (DMC) and soluble solid content (SSC) from near-infrared (NIR) spectra acquired inline. A total of 415 durian pulp samples were collected and analyzed. Raw spectra were preprocessed using five combinations of spectral preprocessing techniques: Moving Average with Standard Normal Variate (MA+SNV), and Savitzky-Golay smoothing paired with Standard Normal Variate (SG+SNV), Mean Normalization (SG+MN), Baseline Correction (SG+BC), or Multiplicative Scatter Correction (SG+MSC). SG+SNV preprocessing produced the best performance with both the PLS-DA and the machine learning algorithms. Among the machine learning algorithms, the optimized wide neural network achieved the highest overall classification accuracy, 85.3%, outperforming the PLS-DA model, whose overall classification accuracy was 81.4%. Additional evaluation metrics (recall, precision, specificity, F1-score, AUC-ROC, and kappa) were calculated and compared between the two models. The findings demonstrate that machine learning algorithms can match or exceed the performance of PLS-DA in classifying Monthong durian pulp by DMC and SSC using NIR spectroscopy, and that they can be applied to quality control and management in durian pulp production and storage.
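As an illustration of the kind of preprocessing the abstract names, the following is a minimal sketch of a moving average followed by Standard Normal Variate (the MA+SNV combination). It is not the authors' implementation: the window size, the edge handling, the use of the sample standard deviation, and the example absorbance values are all assumptions made for this sketch, and the function names are hypothetical.

```python
from statistics import mean, stdev


def moving_average(spectrum, window=3):
    """Smooth a spectrum with a centered moving average.

    Assumption: near the edges the window is simply truncated to the
    points that exist, so the output has the same length as the input.
    """
    half = window // 2
    n = len(spectrum)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(mean(spectrum[lo:hi]))
    return smoothed


def snv(spectrum):
    """Standard Normal Variate: center and scale one spectrum using
    that spectrum's own mean and (sample) standard deviation."""
    mu = mean(spectrum)
    sd = stdev(spectrum)
    if sd == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(x - mu) / sd for x in spectrum]


# Made-up absorbance values for a single spectrum (illustration only).
raw = [0.52, 0.55, 0.61, 0.70, 0.66, 0.58, 0.54]
processed = snv(moving_average(raw, window=3))
```

Because SNV is applied per spectrum, each processed spectrum has zero mean and unit standard deviation, which reduces multiplicative scatter differences between samples before a classifier is fitted.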