Spectroscopy Approaches for Food Safety Applications: Improving Data Efficiency Using Active Learning and Semi-supervised Learning.

Authors

Zhang Huanle, Wisuthiphaet Nicharee, Cui Hemiao, Nitin Nitin, Liu Xin, Zhao Qing

Affiliations

Department of Computer Science, University of California, Davis, Davis, CA, United States.

Department of Food Science and Technology, University of California, Davis, Davis, CA, United States.

Publication information

Front Artif Intell. 2022 Jun 22;5:863261. doi: 10.3389/frai.2022.863261. eCollection 2022.

DOI: 10.3389/frai.2022.863261

PMID: 35814488

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9257238/

Abstract

The past decade has witnessed rapid development in measurement and monitoring technologies for food science. Among these technologies, spectroscopy has been widely used to analyze food quality, safety, and nutritional properties. Because food systems are complex and comprehensive predictive models are lacking, rapid and simple measurements that predict complex properties of food systems are largely missing. Machine Learning (ML) has shown great potential to improve the classification and prediction of these properties. However, barriers to collecting large datasets for ML applications persist. In this paper, we explore different approaches to data annotation and model training to improve data efficiency for ML applications. Specifically, we leverage Active Learning (AL) and Semi-Supervised Learning (SSL) and investigate four approaches: baseline passive learning, AL, SSL, and a hybrid of AL and SSL. To evaluate these approaches, we collect two spectroscopy datasets, one for predicting plasma dosage and one for detecting foodborne pathogens. Our experimental results show that, compared to the passive learning approach, the advanced approaches (AL, SSL, and the hybrid) can greatly reduce the number of labeled samples required, in some cases by more than half.
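
The abstract does not say which models, query strategies, or confidence rules the authors used, so the following is only an illustrative sketch of how a hybrid AL and SSL round might look. It assumes uncertainty sampling for the AL step and confidence-thresholded pseudo-labeling (self-training) for the SSL step, with a toy nearest-centroid classifier standing in for a real spectral model. All function names, the `confident_margin` parameter, and the data are hypothetical.

```python
import math

def fit_centroids(X, y):
    """Toy nearest-centroid 'model': one mean feature vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict_with_margin(x, centroids):
    """Return (predicted label, margin); a small margin means an uncertain prediction.

    Assumes at least two classes are present in `centroids`.
    """
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    return label, d2 - d1

def hybrid_round(X, labeled, unlabeled, oracle, confident_margin):
    """One hybrid round: query the least certain sample (AL),
    then pseudo-label the confidently predicted ones (SSL).

    labeled: dict index -> label; unlabeled: set of indices;
    oracle: callable index -> true label (stands in for a human annotator).
    """
    idx = sorted(labeled)
    centroids = fit_centroids([X[i] for i in idx], [labeled[i] for i in idx])
    scored = {i: predict_with_margin(X[i], centroids) for i in sorted(unlabeled)}
    # Active learning: spend the annotation budget on the most ambiguous sample.
    query = min(scored, key=lambda i: scored[i][1])
    new_labels = {query: oracle(query)}
    # Semi-supervised learning: accept the model's own label where it is confident.
    for i, (label, margin) in scored.items():
        if i != query and margin >= confident_margin:
            new_labels[i] = label
    return query, new_labels

# Hypothetical two-feature "spectra" with two classes.
X = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.2], [0.9, 0.8], [0.5, 0.6]]
truth = [0, 1, 0, 1, 1]
labeled = {0: 0, 1: 1}      # two samples annotated up front
unlabeled = {2, 3, 4}

query, new_labels = hybrid_round(X, labeled, unlabeled, lambda i: truth[i], 0.5)
labeled.update(new_labels)
unlabeled -= set(new_labels)
print("queried sample:", query, "labels after one round:", labeled)
```

In this sketch a passive-learning baseline would choose the sample to annotate at random instead of by smallest margin, AL alone would skip the pseudo-labeling loop, and SSL alone would skip the oracle query.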

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e397/9257238/2bb2be6b3938/frai-05-863261-g0001.jpg
