Food Image Recognition Based on Anti-Noise Learning and Covariance Feature Enhancement

Author information

Chen Zengzheng, Chen Hao, Wang Jianxin, Wang Yeru

Affiliations

School of Information, Beijing Forestry University, Beijing 100083, China.

Risk Assessment Division 1, China National Center for Food Safety Risk Assessment, Beijing 100022, China.

Publication information

Foods. 2025 Aug 9;14(16):2776. doi: 10.3390/foods14162776.

DOI:10.3390/foods14162776

PMID:40870689

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12385756/

Abstract

Food image recognition is a key research area in food computing, with applications in dietary assessment, menu analysis, and nutrition monitoring. However, imaging devices and environmental factors introduce noise, which limits classification performance. To address this, we propose a food image recognition method based on anti-noise learning and covariance feature enhancement. Specifically, we design a Noise Adaptive Recognition Module (NARM), which incorporates noisy images during training and treats denoising as an auxiliary task to enhance noise invariance and recognition accuracy. To mitigate the adverse effects of noise and strengthen the representation of small eigenvalues, we introduce Eigenvalue-Enhanced Global Covariance Pooling (EGCP) into NARM. Furthermore, we develop a Weighted Multi-Granularity Fusion (WMF) method to improve feature extraction. Combined with the Progressive Temperature-Aware Feature Distillation (PTAFD) strategy, our approach optimizes model efficiency without adding overhead to the backbone. Experimental results demonstrate that our model achieves state-of-the-art performance on the ETH Food-101 and Vireo Food-172 datasets. Specifically, it reaches a Top-1 accuracy of 92.57% on ETH Food-101, outperforming existing methods, and it also delivers strong Top-5 accuracy on ETH Food-101 as well as strong Top-1 and Top-5 accuracy on Vireo Food-172. These findings confirm the effectiveness and robustness of the proposed approach in real-world food image recognition.
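The idea of training on noisy images with denoising as an auxiliary task can be illustrated with a minimal sketch. This is not the authors' NARM implementation: the abstract does not specify the noise model, the loss functions, or how the two objectives are combined. The sketch assumes additive Gaussian noise, a softmax cross-entropy classification loss, a mean-squared-error denoising loss, and a simple weighted sum; the names `add_gaussian_noise`, `joint_loss`, and `aux_weight` are hypothetical.

```python
import math
import random


def add_gaussian_noise(pixels, sigma, rng):
    """Return a noisy copy of pixels (floats in [0, 1]), clipped back to [0, 1].

    Gaussian noise is an assumption; the abstract does not name a noise model.
    """
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mse(a, b):
    """Mean squared error between two equal-length pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(logits, label, denoised, clean, aux_weight=0.5):
    """Classification loss plus a weighted denoising loss.

    aux_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    return cross_entropy(logits, label) + aux_weight * mse(denoised, clean)


rng = random.Random(0)
clean = [0.2, 0.5, 0.8, 1.0]           # toy "image" of four pixels
noisy = add_gaussian_noise(clean, sigma=0.1, rng=rng)

# Stand-ins for network outputs: fixed class logits, and an untrained
# "denoiser" that simply returns its noisy input.
logits = [2.0, 0.5, -1.0]
denoised = noisy
loss = joint_loss(logits, label=0, denoised=denoised, clean=clean)
```

In a real training loop the logits and the denoised image would both come from the network fed with `noisy`, so the gradient of the auxiliary term pushes shared features toward representations that are less sensitive to the injected noise.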
