Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, 50603, Selangor, Malaysia.
Comput Biol Med. 2021 Dec;139:104972. doi: 10.1016/j.compbiomed.2021.104972. Epub 2021 Oct 27.
Food recognition systems have recently attracted considerable research attention because they can provide objective measurements of dietary intake, which supports the management of various chronic conditions. Challenges such as inter- and intraclass variation, together with practical deployment on smart glasses, wearable cameras, and mobile devices, call for resource-efficient food recognition models with high classification performance. Explainable AI is also crucial in health-related domains because it characterizes model performance, enhancing transparency and objectivity. Our proposed architecture attempts to address these challenges by drawing on transfer learning, initializing MobileNetV3 with weights pre-trained on ImageNet. MobileNetV3 achieves superior performance through the squeeze-and-excitation strategy, which assigns unequal weights to different input channels, in contrast to the equal weighting used in other variants. Although it is fast and efficient, it is likely, as with other deep neural networks, to become stuck in local optima, which reduces the classification performance of the model. We overcome this issue by applying the snapshot ensemble approach, which yields M models from a single training process without any increase in the required training time. Each snapshot in the ensemble visits different local minima before converging to the final solution, which enhances recognition performance. To address the challenge of explainability, we argue that explanations cannot be monolithic, since each stakeholder perceives the results and their explanations according to different objectives and aims. We therefore propose a user-centered explainable artificial intelligence (AI) framework that increases the trust of the involved parties by inferring and rationalizing the results according to the user's needs and profile.
Our framework covers the full scope of a dietary assessment app: it detects food versus non-food, food categories, and ingredients. Experimental results on standard food benchmarks and on a newly contributed Malaysian food dataset for ingredient detection demonstrated superior performance over other methods on an integrated set of measures.
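The snapshot ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used cyclic cosine-annealed learning rate, with one snapshot saved at the end of each cycle and class probabilities averaged at test time. The abstract does not state the schedule, the number of cycles, or the combination rule, so the function names, parameters, and values below are hypothetical and are not the authors' implementation.

```python
import math


def snapshot_lr(iteration, total_iterations, num_cycles, initial_lr):
    """Cyclic cosine-annealed learning rate (assumed schedule).

    `iteration` is 0-based. The rate restarts at `initial_lr` at the
    beginning of every cycle and decays toward zero by the cycle's end.
    """
    cycle_len = math.ceil(total_iterations / num_cycles)
    position = iteration % cycle_len
    return 0.5 * initial_lr * (math.cos(math.pi * position / cycle_len) + 1.0)


def snapshot_iterations(total_iterations, num_cycles):
    """0-based iterations at which a snapshot is saved (last step of each cycle)."""
    cycle_len = math.ceil(total_iterations / num_cycles)
    return [
        min((k + 1) * cycle_len, total_iterations) - 1
        for k in range(num_cycles)
    ]


def average_predictions(snapshot_probs):
    """Average per-class probabilities across snapshots (assumed combination rule).

    `snapshot_probs` is a list with one probability vector per snapshot.
    """
    num_models = len(snapshot_probs)
    num_classes = len(snapshot_probs[0])
    return [
        sum(probs[c] for probs in snapshot_probs) / num_models
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    # Hypothetical budget: 300 iterations split into M = 3 cycles.
    print(snapshot_iterations(300, 3))
    print(snapshot_lr(0, 300, 3, 0.1), snapshot_lr(99, 300, 3, 0.1))
    # Two hypothetical snapshots voting on a two-class problem.
    print(average_predictions([[0.6, 0.4], [0.2, 0.8]]))
```

Because the learning rate is restarted rather than the training run, the M snapshots cost one training budget in total, which is consistent with the abstract's claim of no increase in training time.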