Ma Peihua, Lau Chun Pong, Yu Ning, Li An, Liu Ping, Wang Qin, Sheng Jiping
School of Agricultural Economics and Rural Development, Renmin University of China, Beijing 100872, China; Department of Nutrition and Food Science, College of Agriculture and Natural Resources, University of Maryland, College Park, MD 20740, United States.
Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD 21218, United States.
Food Res Int. 2021 Sep;147:110437. doi: 10.1016/j.foodres.2021.110437. Epub 2021 May 24.
Food image recognition systems facilitate dietary assessment and, in turn, help track users' dietary behaviors. However, because Chinese food is so diverse, quick and accurate food image recognition is a particularly challenging task. The success of deep learning in computer vision inspired us to investigate its potential for this task. To meet its need for large-scale data, we established the first open-access image database for Chinese dishes, named ChinaFood-100, with quantitative nutrient annotations. We collected 10,074 images covering 100 food categories, including staple foods, meat, seafood, and vegetables. Using this dataset, we trained four state-of-the-art deep neural network architectures for image recognition and found that Inception V3 performed best, reaching 78.26% top-1 accuracy and 96.62% top-5 accuracy. Based on the posterior from image recognition, we further compared three algorithms for food nutrient estimation. The top-5 Arithmetic Mean (AM) algorithm achieved the highest regression coefficient (R), up to 0.73 for protein estimation, which supports its applicability in practice. In addition, we analyzed our algorithm in terms of precision-recall and Grad-CAM. These results may encourage wider application of artificial intelligence in the field of food and point to directions for future improvement.
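The two quantities reported above, top-k accuracy and a top-5 arithmetic-mean nutrient estimate, can be illustrated with a minimal sketch. The abstract does not define the AM algorithm, so the code assumes it is the unweighted mean of the nutrient annotations of the five highest-scoring classes; the function names, the six-class probability vectors, and the protein values are all hypothetical and are not taken from ChinaFood-100.

```python
from typing import List, Sequence


def top_k_indices(probs: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest probabilities, highest first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


def top_k_accuracy(all_probs: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k top-scoring classes."""
    hits = sum(
        1 for probs, label in zip(all_probs, labels)
        if label in top_k_indices(probs, k)
    )
    return hits / len(labels)


def top_k_arithmetic_mean(probs: Sequence[float],
                          nutrient_per_class: Sequence[float],
                          k: int = 5) -> float:
    """Assumed reading of 'top-5 AM': the unweighted mean of the nutrient
    values annotated for the k most probable classes."""
    idx = top_k_indices(probs, k)
    return sum(nutrient_per_class[i] for i in idx) / len(idx)


# Hypothetical classifier outputs for two images over six classes.
sample_probs = [
    [0.04, 0.40, 0.10, 0.25, 0.15, 0.06],
    [0.70, 0.10, 0.08, 0.06, 0.04, 0.02],
]
sample_labels = [3, 0]

# Hypothetical protein annotation (g per 100 g) for each class.
protein_per_class = [2.0, 10.0, 4.0, 8.0, 6.0, 20.0]

top1 = top_k_accuracy(sample_probs, sample_labels, k=1)
top5 = top_k_accuracy(sample_probs, sample_labels, k=5)
protein_estimate = top_k_arithmetic_mean(sample_probs[0], protein_per_class, k=5)
print(top1, top5, protein_estimate)
```

Averaging over the top five classes rather than trusting the single best guess is a plausible fit for the accuracy gap reported above: the true dish is among the top five far more often than it is ranked first. A probability-weighted mean would be a natural alternative, but the abstract does not say which of the three compared algorithms, if any, used one.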