Faculty of Information Technology and Communication Sciences, Tampere University, 33720 Tampere, Finland.
R&D and Innovation Services, Tampere University of Applied Sciences, 33230 Tampere, Finland.
Sensors (Basel). 2021 Jan 7;21(2):361. doi: 10.3390/s21020361.
The aim of this study was to compare the accuracy of several algorithms in classifying data collected from food scent samples. Measurements made with an electronic nose (eNose) can be used to classify different scents. An eNose was used to measure scent samples from seven food scent sources, both from an open plate and from a sealed jar. The k-Nearest Neighbour (k-NN) classifier provides reasonable accuracy under certain conditions and traditionally uses the Euclidean distance to measure the similarity of samples. The Euclidean distance was therefore used as the baseline distance metric for the k-NN in this paper. Its classification accuracy was compared with the accuracies of the k-NN with 66 alternative distance metrics. In addition, 18 other classifiers were tested with raw eNose data. For each classifier, various parameter settings were tried and compared. Overall, 304 different classifier variations were tested, each differing from the others in at least one parameter value. The results showed that Quadratic Discriminant Analysis, MLPClassifier, C-Support Vector Classification (SVC), and several different single hidden layer Neural Networks yielded lower misclassification rates when applied to the raw data than the k-NN with Euclidean distance. Both MLP classifiers and SVC yielded misclassification rates of less than 3% when applied to raw data. Furthermore, when applied both to the raw data and to data preprocessed by principal component analysis (retaining components that explained at least 95% or 99% of the total variance in the raw data), Quadratic Discriminant Analysis outperformed the other classifiers. The findings of this study can be used for further algorithm development. They can also be used, for example, to improve the estimation of storage times of fruit.
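The comparison of k-NN distance metrics described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the study: the data are synthetic Gaussian clusters (seven classes, to mirror the seven scent sources) rather than eNose measurements, only three of the many possible distance metrics are shown, k = 3 is an arbitrary choice, and leave-one-out evaluation is used simply because it is easy to write down. The function names (`knn_predict`, `loo_misclassification_rate`, `make_synthetic_data`, `compare_metrics`) are hypothetical.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Baseline metric: straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Alternative metric: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Alternative metric: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


# Illustrative subset only; the study compared many more metrics.
METRICS = {"euclidean": euclidean, "manhattan": manhattan, "chebyshev": chebyshev}


def knn_predict(train_X, train_y, query, k, metric):
    """Return the majority label among the k training samples nearest to query."""
    nearest = sorted(zip(train_X, train_y), key=lambda s: metric(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_misclassification_rate(X, y, k, metric):
    """Leave-one-out error: classify each sample using all the others."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k, metric) != y[i]:
            errors += 1
    return errors / len(X)


def make_synthetic_data(n_classes=7, n_per_class=20, n_features=4, seed=0):
    """Hypothetical stand-in for sensor data: one Gaussian cluster per class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        centre = [rng.uniform(0, 10) for _ in range(n_features)]
        for _ in range(n_per_class):
            X.append([rng.gauss(m, 1.0) for m in centre])
            y.append(label)
    return X, y


def compare_metrics(X, y, k=3):
    """Misclassification rate of k-NN for each metric in METRICS."""
    return {name: loo_misclassification_rate(X, y, k, fn) for name, fn in METRICS.items()}


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, rate in compare_metrics(X, y).items():
        print(f"{name}: misclassification rate {rate:.3f}")
```

The rates printed for the synthetic clusters say nothing about the eNose results; the sketch only shows how swapping the distance function changes which neighbours vote while the rest of the classifier stays fixed.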