Ulster University, Jordanstown Campus, School of Computing, Northern Ireland, United Kingdom.
Ulster University, Jordanstown Campus, School of Communication and Media, Northern Ireland, United Kingdom.
Comput Biol Med. 2018 Apr 1;95:217-233. doi: 10.1016/j.compbiomed.2018.02.008. Epub 2018 Feb 17.
Obesity is increasing worldwide and can cause many chronic conditions, such as type 2 diabetes, heart disease, sleep apnea, and some cancers. Monitoring dietary intake through food logging is a key method for maintaining a healthy lifestyle and for preventing and managing obesity. Computer vision methods have been applied to food logging to automate image classification for monitoring dietary intake. In this work, we applied pretrained ResNet-152 and GoogleNet convolutional neural networks (CNNs), initially trained on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset with the MatConvNet package, to extract features from four food image datasets: Food-5K, Food-11, RawFooT-DB, and Food-101. The deep features extracted from the CNNs were used to train machine learning classifiers, including an artificial neural network (ANN), a support vector machine (SVM), Random Forest, and Naive Bayes. Results show that ResNet-152 deep features with an SVM with RBF kernel can accurately detect food items, with 99.4% accuracy on the Food-5K validation dataset, and that 98.8% accuracy is achieved on the Food-5K evaluation dataset using ANN, SVM-RBF, and Random Forest classifiers. Trained with ResNet-152 features, the ANN achieves 91.34% and 99.28% accuracy on the Food-11 and RawFooT-DB datasets respectively, and the SVM with RBF kernel achieves 64.98% on the Food-101 dataset. This research shows that deep CNN features can be used efficiently for diverse food item image classification, and that pretrained ResNet-152 features provide sufficient generalisation power when applied to a range of food image classification tasks.
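The second stage of the pipeline described in the abstract (training several classifiers on fixed deep features) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract reports using MatConvNet for feature extraction, whereas the sketch below uses scikit-learn and replaces the CNN features with synthetic random vectors. The helper names `make_stand_in_features` and `evaluate_classifiers`, the 2048-dimensional feature size, and all hyperparameters are assumptions made for illustration only, so the scores it prints say nothing about the accuracies reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_stand_in_features(n_per_class=100, dim=2048, seed=0):
    """Return synthetic vectors standing in for CNN deep features.

    Assumption: in a real pipeline these rows would be activations taken
    from a pretrained network; here they are two Gaussian clusters.
    """
    rng = np.random.default_rng(seed)
    food = rng.normal(loc=0.5, scale=1.0, size=(n_per_class, dim))
    non_food = rng.normal(loc=-0.5, scale=1.0, size=(n_per_class, dim))
    features = np.vstack([food, non_food])
    labels = np.array([1] * n_per_class + [0] * n_per_class)
    return features, labels


def evaluate_classifiers(features, labels, seed=0):
    """Train four classifiers on fixed features; return held-out accuracy."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    # Hyperparameters are illustrative defaults, not values from the paper.
    classifiers = {
        "SVM-RBF": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "ANN": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                          random_state=seed),
        ),
        "Random Forest": RandomForestClassifier(n_estimators=100,
                                                random_state=seed),
        "Naive Bayes": GaussianNB(),
    }
    scores = {}
    for name, clf in classifiers.items():
        clf.fit(x_train, y_train)
        scores[name] = clf.score(x_test, y_test)  # mean accuracy on test split
    return scores


if __name__ == "__main__":
    feats, labs = make_stand_in_features()
    for name, acc in evaluate_classifiers(feats, labs).items():
        print(f"{name}: {acc:.3f}")
```

Keeping the feature extractor fixed and swapping only the classifier is what makes the comparison in the abstract possible: every classifier sees the same feature vectors, so differences in accuracy come from the classifier rather than from the representation.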