Accelerating the Classification of NOVA Food Processing Levels Using a Fine-Tuned Language Model: A Multi-Country Study.

Affiliations

Department of Nutritional Sciences, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5S 1A1, Canada.

Fundación Interamericana del Corazón Argentina, Buenos Aires C1425, Argentina.

Publication information

Nutrients. 2023 Sep 27;15(19):4167. doi: 10.3390/nu15194167.

Abstract

The consumption and availability of ultra-processed foods (UPFs), which are associated with an increased risk of noncommunicable diseases, have increased in most countries. While many countries have incorporated or are planning to incorporate UPF recommendations into their national dietary guidelines, the classification of food processing levels relies on expertise-based manual categorization, which is labor-intensive and time-consuming. Our study used transformer-based language models to automate the classification of food processing levels according to the NOVA classification system in the Canadian, Argentinian, and US national food databases. We showed that fine-tuned language models using the ingredient list text found on food labels as inputs achieved a high overall accuracy (F1 score of 0.979) in predicting the food processing levels of Canadian food products, outperforming traditional machine learning models using structured nutrient data and bag-of-words. Most of the food categories reached a prediction accuracy of 0.98 using a fine-tuned language model, especially for predicting processed foods and ultra-processed foods. Our automation strategy was also effective and generalizable for classifying food products in the Argentinian and US databases, providing a cost-effective approach for policymakers to monitor and regulate UPFs in the global food supply.
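As a rough illustration of two ideas the abstract mentions, the sketch below builds a bag-of-words representation from ingredient list text and computes an F1 score over the four NOVA groups. It is not the authors' pipeline: the function names are hypothetical, the tokenization rule is an assumption, and macro-averaging is assumed because the abstract does not say how the reported F1 score was averaged.

```python
import re
from collections import Counter

# NOVA groups 1 (unprocessed/minimally processed) to 4 (ultra-processed).
NOVA_LABELS = [1, 2, 3, 4]


def bag_of_words(ingredient_text):
    """Count lowercase alphabetic tokens in an ingredient list (assumed tokenization)."""
    tokens = re.findall(r"[a-z]+", ingredient_text.lower())
    return Counter(tokens)


def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: F1 is defined as 0 here by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=NOVA_LABELS):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


if __name__ == "__main__":
    # Toy, made-up labels purely for demonstration.
    print(bag_of_words("Sugar, palm oil, cocoa, sugar"))
    print(macro_f1([1, 2, 3, 4, 4], [1, 2, 3, 4, 3]))
```

A fine-tuned language model would replace the word counts with learned representations of the same ingredient text, while the evaluation step would stay the same.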
