Xun Zhuoran, Wang Xuemeng, Xue Hao, Zhang Qingzheng, Yang Wanqi, Zhang Hua, Li Mingzhu, Jia Shangang, Qu Jiangyong, Wang Xumin
College of Life Sciences, Yantai University, Yantai, 264005, China.
College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China.
Curr Res Food Sci. 2024 Jun 14;9:100784. doi: 10.1016/j.crfs.2024.100784. eCollection 2024.
Food fraud is widespread in the aquatic food market, so fast and non-destructive methods for identifying fish flesh are needed. In this study, multispectral imaging (MSI) was used to screen flesh slices from 20 edible fish species commonly found in the sea around Yantai, China, in combination with species identification based on a mitochondrial gene. We found that nCDA images transformed from the MSI data differed significantly among the flesh slices of the 20 fish species. We then compared the prediction performance of eight models using the hold-out method, with 70% of the data as the training set and 30% as the test set. The convolutional neural network (CNN), quadratic discriminant analysis (QDA), support vector machine (SVM), and linear discriminant analysis (LDA) models performed well on both cross-validation and test data, and CNN and QDA achieved more than 99% accuracy on the test set. Extracting the CNN features for optimization yielded a very high degree of separation for all species. Furthermore, based on the Gini index from a random forest (RF), 11 bands were selected as key classification features for the CNN, which achieved an accuracy of 98%. Our study developed a successful pipeline for applying machine learning models (especially CNN) to MSI-based identification of fish flesh, and provides a convenient and non-destructive method for identifying fish flesh on the market in the future.
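The model comparison and band selection described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' code. The spectra are synthetic, and the total number of bands (19) and the number of slices per species (60) are assumptions chosen only for illustration. The CNN is omitted to keep the example short, so an LDA refit on the selected bands stands in for the reduced-band CNN reported in the study.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# 20 species and 11 key bands come from the abstract; the number of slices
# per species and the total band count are assumptions for this sketch.
N_SPECIES, N_PER_SPECIES, N_BANDS, N_KEY_BANDS = 20, 60, 19, 11


def make_synthetic_spectra(seed=0):
    """Simulate mean reflectance per band for each flesh slice (not real data)."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform(0.2, 0.8, size=(N_SPECIES, N_BANDS))
    X = np.repeat(centers, N_PER_SPECIES, axis=0)
    X = X + rng.normal(0.0, 0.03, size=X.shape)
    y = np.repeat(np.arange(N_SPECIES), N_PER_SPECIES)
    return X, y


def run_pipeline(seed=0):
    X, y = make_synthetic_spectra(seed)

    # Hold-out split: 70% training, 30% test, stratified by species.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    models = {
        "LDA": LinearDiscriminantAnalysis(),
        "QDA": QuadraticDiscriminantAnalysis(),
        "SVM": SVC(kernel="rbf"),
        "RF": RandomForestClassifier(n_estimators=200, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        scores[name] = model.score(X_te, y_te)  # test-set accuracy

    # Rank bands by the random forest's Gini-based importance
    # (mean decrease in impurity) and keep the top N_KEY_BANDS.
    importances = models["RF"].feature_importances_
    key_bands = np.argsort(importances)[::-1][:N_KEY_BANDS]

    # Refit a classifier on the selected bands only (LDA as a stand-in).
    reduced = LinearDiscriminantAnalysis().fit(X_tr[:, key_bands], y_tr)
    scores["LDA_key_bands"] = reduced.score(X_te[:, key_bands], y_te)
    return scores, key_bands


if __name__ == "__main__":
    scores, key_bands = run_pipeline()
    for name, acc in scores.items():
        print(f"{name}: {acc:.3f}")
    print("selected bands:", sorted(key_bands.tolist()))
```

Because the synthetic classes are well separated by construction, the accuracies printed here say nothing about real fish flesh. The sketch only shows the order of operations: stratified hold-out split, model comparison on the test set, Gini-based band ranking, and refitting on the reduced band set.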