Gaitán-Guerrero Juan F, Martínez-Cruz Carmen, Espinilla Macarena, Díaz-Jiménez David, López Jose L
Department of Computer Science, University of Jaén, Jaén, 23071, Spain.
Department of Languages and Computer Systems, University of Granada, E.T.S. de Ingenierías Informática y de Telecomunicación, Granada, 18071, Spain.
Comput Methods Programs Biomed. 2025 Sep;269:108878. doi: 10.1016/j.cmpb.2025.108878. Epub 2025 Jun 9.
Diabetes is a global health concern that affects millions of adults worldwide, and its prevalence is growing. Disease management relies heavily on continuous glucose monitoring, yet the dense and complex data streams produced by monitoring devices are difficult to interpret efficiently. Large Language Models are widely applied across domains for their ability to generate human-like text, but they still fall short in producing accurate and meaningful text from raw data. To address this limitation, this study proposes a fine-tuning methodology tailored to glucose data, yet scalable to other expert-guided domains, that enables models to generate concise, relevant and safe summaries, bridging the gap between raw data and efficient medical attention.
This study introduces a novel continuous glucose monitoring framework in which GPT models are fine-tuned on structured datasets generated through expert-guided data modeling based on Fuzzy Logic, with prompt engineering used for task contextualization. A new evaluation methodology is defined to assess the performance of Large Language Models in critical domains where expert knowledge is fundamental to characterizing temporally dependent data and ensuring valuable insights.
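To make the idea of fuzzy, expert-guided data modeling concrete, the following is a minimal sketch of how raw glucose readings might be mapped to linguistic labels and condensed into a prompt. It is not the authors' implementation: the set names, the breakpoints in mg/dL, the function names and the prompt wording are all illustrative assumptions.

```python
import math

# Hypothetical trapezoidal fuzzy sets over glucose in mg/dL, given as (a, b, c, d).
# The breakpoints are illustrative assumptions, not values from the study.
GLUCOSE_SETS = {
    "low": (-math.inf, -math.inf, 60.0, 80.0),
    "in range": (60.0, 80.0, 160.0, 200.0),
    "high": (160.0, 200.0, math.inf, math.inf),
}


def trapezoid(x, a, b, c, d):
    """Membership degree: 1 on [b, c], linear on (a, b) and (c, d), else 0."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def fuzzify(value):
    """Return the membership degree of one reading in every fuzzy set."""
    return {label: trapezoid(value, *params) for label, params in GLUCOSE_SETS.items()}


def mean_memberships(readings):
    """Average the membership degrees over a non-empty series of readings."""
    totals = {label: 0.0 for label in GLUCOSE_SETS}
    for value in readings:
        for label, degree in fuzzify(value).items():
            totals[label] += degree
    return {label: total / len(readings) for label, total in totals.items()}


def build_prompt(readings):
    """Turn the fuzzy summary into a (hypothetical) task prompt for a language model."""
    summary = mean_memberships(readings)
    facts = ", ".join(f"{label}: {degree:.0%}" for label, degree in summary.items())
    return (
        "Summarize the patient's glucose behaviour for a clinician. "
        f"Average fuzzy membership over the period -> {facts}."
    )


if __name__ == "__main__":
    print(build_prompt([70, 120, 120, 250]))
```

Averaging membership degrees is only one of many possible aggregation choices; the point of the sketch is that the model receives expert-defined linguistic descriptors rather than the raw stream.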
Fine-tuned GPT-4o achieved the highest performance, with an average score of 96% across all metrics. GPT-4o-mini followed with a score of 76%, and GPT-3.5 scored 72%. Fuzzy knowledge-based prompts proved more effective when full data were available, or when simplified data were available and the models were not fine-tuned; domain-guided prompts improved output relevance and stability in fine-tuned models with less data available.
These results indicate that our methods can align Large Language Models with the task of generating human-like text from raw data, highlighting their potential to support diabetes management by interpreting complex glucose patterns and thereby alleviate the burden on healthcare systems.