A Novel Machine-Learning Framework Based on a Hierarchy of Dispute Models for the Identification of Fish Species Using Multi-Mode Spectroscopy.

Affiliations

Biomedical Engineering Program, University of North Dakota, Grand Forks, ND 58202, USA.

SafetySpect Inc., Grand Forks, ND 58202, USA.

Publication information

Sensors (Basel). 2023 Nov 9;23(22):9062. doi: 10.3390/s23229062.

DOI:10.3390/s23229062

PMID:38005450

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10674920/

Abstract

Seafood mislabeling rates of approximately 20% have been reported globally. Traditional methods for fish species identification, such as DNA analysis and polymerase chain reaction (PCR), are expensive and time-consuming, and require skilled technicians and specialized equipment. The combination of spectroscopy and machine learning presents a promising approach to overcome these challenges. In this study, we considered 43 fish species and employed three modes of spectroscopy: fluorescence (Fluor), and reflectance in the visible near-infrared (VNIR) and short-wave near-infrared (SWIR). To achieve higher accuracies, we developed a novel machine-learning framework in which groups of similar fish types were identified and specialized classifiers were trained for each group. Combining a global model (a single artificial intelligence for all species) with dispute classification models created a hierarchical decision process that yielded higher performance. For Fluor, VNIR, and SWIR, accuracies increased from 80%, 75%, and 49% to 83%, 81%, and 58%, respectively. Furthermore, for certain species, single-mode identification performance improved by up to 40%. Fusing all three spectroscopic modes improved on the best single mode by a further 9%, averaged over all species. Fish species mislabeling not only poses health-related risks due to contaminants, toxins, and allergens that could be life-threatening, but also gives rise to economic and environmental hazards and loss of nutritional benefits. Our proposed method can detect fish fraud as a real-time alternative to DNA barcoding and other standard methods. The hierarchical system of dispute models proposed in this work is a novel machine-learning tool not limited to this application, and can improve accuracy in any classification problem that contains a large number of classes.
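
The hierarchical decision process described in the abstract (a global model first, then a specialized dispute model for groups of similar species) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the within-class feature scaling used by the dispute model, the hand-picked group, the class names, and the two-feature toy data standing in for spectra. The abstract does not state how groups were identified or which classifiers were used.

```python
from collections import defaultdict
from math import sqrt


def centroids(X, y):
    """Return the mean feature vector for each class label."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        prev = sums.get(label, [0.0] * len(x))
        sums[label] = [s + v for s, v in zip(prev, x)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def within_class_std(X, y):
    """Pooled per-feature standard deviation around each class centroid."""
    cents = centroids(X, y)
    sq = [0.0] * len(X[0])
    for x, label in zip(X, y):
        for j, v in enumerate(x):
            sq[j] += (v - cents[label][j]) ** 2
    # Clamp to avoid division by zero for constant features.
    return [max(sqrt(s / len(X)), 1e-9) for s in sq]


class NearestCentroid:
    """Stand-in classifier (assumption): nearest class centroid."""

    def __init__(self, scale=None):
        self.scale = scale  # optional per-feature scaling

    def fit(self, X, y):
        self.scale_ = self.scale or [1.0] * len(X[0])
        self.centroids_ = centroids(X, y)
        return self

    def predict_one(self, x):
        def dist(c):
            return sum(((a - b) / s) ** 2
                       for a, b, s in zip(x, self.centroids_[c], self.scale_))
        return min(self.centroids_, key=dist)


class HierarchicalClassifier:
    """Global model first; a dispute model re-decides within a group."""

    def __init__(self, groups):
        self.groups = [frozenset(g) for g in groups]  # assumed to be given

    def fit(self, X, y):
        self.global_ = NearestCentroid().fit(X, y)
        self.dispute_ = {}
        for group in self.groups:
            Xg = [x for x, label in zip(X, y) if label in group]
            yg = [label for label in y if label in group]
            # The dispute model sees only the group's samples and scales
            # features by how noisy they are within the group's classes.
            model = NearestCentroid(scale=within_class_std(Xg, yg)).fit(Xg, yg)
            for label in group:
                self.dispute_[label] = model
        return self

    def predict_one(self, x):
        first = self.global_.predict_one(x)
        model = self.dispute_.get(first)
        return model.predict_one(x) if model else first


# Toy data: feature 0 is noisy, feature 1 separates cod from haddock.
X = [[-3, 0.0], [3, 0.1], [0, -0.1],    # cod
     [-2, 1.0], [4, 1.1], [1, 0.9],     # haddock
     [20, 5.0], [22, 5.0], [21, 5.0]]   # salmon
y = ["cod"] * 3 + ["haddock"] * 3 + ["salmon"] * 3

global_model = NearestCentroid().fit(X, y)
hier_model = HierarchicalClassifier(groups=[{"cod", "haddock"}]).fit(X, y)

sample = [3, 0.1]  # a cod sample that lies close to the haddock centroid
print(global_model.predict_one(sample))  # global model alone
print(hier_model.predict_one(sample))    # after the dispute model
```

In this toy setup the global model confuses the cod sample with haddock because the noisy first feature dominates the distance, while the dispute model, fitted only on the two similar classes, down-weights that feature and recovers the correct label. Classes outside any group, such as salmon here, keep the global model's decision.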
