Kopitar Leon, Bedrač Leon, Strath Larissa J, Bian Jiang, Stiglic Gregor
Faculty of Health Sciences, University of Maribor, 2000 Maribor, Slovenia.
Faculty of Electrical Engineering and Computer Science, University of Maribor, 2000 Maribor, Slovenia.
Nutrients. 2025 Apr 29;17(9):1492. doi: 10.3390/nu17091492.
BACKGROUND/OBJECTIVES: Identifying and decomposing compound ingredients within meal plans is a challenge for meal customization and nutritional analysis. Accurate decomposition is essential for identifying and replacing problematic ingredients linked to allergies or intolerances and for supporting nutritional evaluation. METHODS: This study explored the effectiveness of three large language models (LLMs), GPT-4o, Llama-3 (70B), and Mixtral (8x7B), in decomposing compound ingredients into basic ingredients within meal plans. GPT-4o was used to generate 15 structured meal plans, each containing compound ingredients. Each LLM then identified and decomposed these compound items into basic ingredients. The decomposed ingredients were matched to entries in a subset of the USDA FoodData Central repository using API-based search and mapping techniques. Nutritional values were retrieved and aggregated to evaluate the accuracy of decomposition. Performance was assessed through manual review by nutritionists and quantified using accuracy and F1-score. Statistical significance was tested using paired t-tests or Wilcoxon signed-rank tests, depending on normality. RESULTS: The larger models, Llama-3 (70B) and GPT-4o, outperformed Mixtral (8x7B), achieving average F1-scores of 0.894 (95% CI: 0.84-0.95) and 0.842 (95% CI: 0.79-0.89), respectively, compared with an F1-score of 0.690 (95% CI: 0.62-0.76) for Mixtral (8x7B). CONCLUSIONS: The open-source Llama-3 (70B) model achieved the best performance, outperforming the commercial GPT-4o model. This result shows its ability to consistently break down compound ingredients into precise quantities within meal plans and illustrates its potential to enhance meal customization and nutritional analysis. These findings underscore the potential role of advanced LLMs in precision nutrition and their application in promoting healthier dietary practices tailored to individual preferences and needs.
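The abstract reports F1-scores but does not say how they were computed for each meal plan. As a rough illustration only, the sketch below assumes that a model's decomposition is scored by set overlap between its predicted basic ingredients and a nutritionist-reviewed reference list, after simple name normalization. The function names `normalize` and `decomposition_f1` and the example ingredient lists are hypothetical and are not taken from the study.

```python
from typing import Iterable


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def decomposition_f1(predicted: Iterable[str], reference: Iterable[str]) -> float:
    """Set-based F1 between predicted and reference basic ingredients.

    Assumption: exact match on normalized names; the study's actual
    matching rules are not described in the abstract.
    """
    pred = {normalize(n) for n in predicted}
    ref = {normalize(n) for n in reference}
    if not pred and not ref:
        return 1.0  # nothing to decompose and nothing predicted
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: decomposing "pesto" into basic ingredients.
reference = ["Basil", "pine nuts", "olive oil", "parmesan", "garlic"]
predicted = ["basil", "olive oil", "garlic", "salt"]
score = decomposition_f1(predicted, reference)
print(f"F1 = {score:.3f}")
```

Here three of the four predicted ingredients appear in the reference (precision 0.75) and three of the five reference ingredients are recovered (recall 0.60). Per-plan scores of this kind could then be averaged across the 15 meal plans and compared between models with a paired test, as the abstract describes.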