Jagadesh B N, Mantena Srihari Varma, Sathe Asha P, Prabhakara Rao T, Lella Kranthi Kumar, Pabboju Shyam Sunder, Vatambeti Ramesh
School of Computer Science and Engineering, VIT-AP University, Vijayawada, India.
Department of Computer Science and Engineering, SRKR Engineering College, Bhimavaram, 534204, India.
Sci Rep. 2025 Feb 15;15(1):5591. doi: 10.1038/s41598-025-90244-4.
This study presents a robust approach to continuous food recognition, a task essential to nutritional research, built on computer vision techniques. The method first applies Mutually Guided Image Filtering (MuGIF) to reduce noise and improve dataset quality, then extracts features with the Visual Geometry Group (VGG) architecture. A hybrid transformer model combining Vision Transformer and Swin Transformer variants is introduced to exploit their complementary strengths, and its hyperparameters are optimized with the Improved Discrete Bat Algorithm (IDBA). In experiments, the proposed model achieves a classification accuracy of 99.83%, outperforming existing methods. These results underscore the potential of hybrid transformer architectures combined with image preprocessing for food recognition systems, with practical applications in dietary monitoring and personalized nutrition recommendations.
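To make the hyperparameter optimization step more concrete, the following is a minimal sketch of a generic discrete bat-style search over a small hyperparameter grid. It is not the authors' IDBA, whose specific improvements the abstract does not describe. The search space, the `toy_objective` function (a stand-in for validation accuracy of the hybrid transformer), the coordinate-copy move, and all constants are assumptions chosen for illustration only.

```python
import math
import random

# Hypothetical discrete search space; the abstract does not list the tuned hyperparameters.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 3e-5, 1e-4, 3e-4, 1e-3],
    "batch_size": [16, 32, 64, 128],
    "num_heads": [4, 8, 12, 16],
    "dropout": [0.0, 0.1, 0.2, 0.3],
}
KEYS = list(SEARCH_SPACE)
SIZES = [len(SEARCH_SPACE[k]) for k in KEYS]


def toy_objective(position):
    """Assumed stand-in for validation accuracy: highest (0) at index vector [2, 1, 2, 1]."""
    target = [2, 1, 2, 1]
    return -sum((p - t) ** 2 for p, t in zip(position, target))


def decode(position):
    """Map a vector of grid indices to concrete hyperparameter values."""
    return {k: SEARCH_SPACE[k][i] for k, i in zip(KEYS, position)}


def discrete_bat_search(objective, n_bats=8, n_iters=40, seed=0):
    """Generic discrete bat-style search (illustrative, not the paper's IDBA)."""
    rng = random.Random(seed)
    bats = [[rng.randrange(s) for s in SIZES] for _ in range(n_bats)]
    scores = [objective(b) for b in bats]
    loudness = [1.0] * n_bats    # acceptance probability, decays after each accepted move
    pulse_rate = [0.0] * n_bats  # grows over time, making local search around the best rarer
    best_i = max(range(n_bats), key=scores.__getitem__)
    best, best_score = list(bats[best_i]), scores[best_i]

    for t in range(1, n_iters + 1):
        for i in range(n_bats):
            # Global move: copy each coordinate from the best bat with probability `freq`
            # (an assumed discrete analogue of the frequency-scaled velocity update).
            freq = rng.random()
            cand = [b if rng.random() < freq else x for x, b in zip(bats[i], best)]
            # Local move: with probability (1 - pulse rate), perturb the best by one grid step.
            if rng.random() > pulse_rate[i]:
                cand = list(best)
                d = rng.randrange(len(SIZES))
                cand[d] = min(SIZES[d] - 1, max(0, cand[d] + rng.choice((-1, 1))))
            s = objective(cand)
            # Accept a non-worsening candidate with probability equal to the loudness.
            if s >= scores[i] and rng.random() < loudness[i]:
                bats[i], scores[i] = cand, s
                loudness[i] *= 0.9
                pulse_rate[i] = 0.5 * (1.0 - math.exp(-0.1 * t))
            # Track the best solution seen so far; it never gets worse.
            if s > best_score:
                best, best_score = list(cand), s
    return best, best_score


if __name__ == "__main__":
    position, score = discrete_bat_search(toy_objective)
    print(decode(position), score)
```

In a real setting, `toy_objective` would be replaced by a function that trains or fine-tunes the model with the decoded hyperparameters and returns a validation metric, which is far more expensive than the toy function used here.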