Enhancing Clinical Accuracy of Medical Chatbots with Large Language Models.

Author information

Liu Zhonghua, Quan Yu, Lyu Xiaohong, Alenazi Mohammed J F

Publication information

IEEE J Biomed Health Inform. 2024 Sep 27;PP. doi: 10.1109/JBHI.2024.3470323.

Abstract

The rapid advancement of large language models (LLMs) has opened up new possibilities for transforming healthcare practices, patient interactions, and medical report generation. This paper explores the application of LLMs to medical chatbots and virtual assistants that prioritize clinical accuracy. We propose a novel multi-turn dialogue model built on three components: adjusting the position of layer normalization to improve training stability and convergence, a contextual sliding window reply prediction task that captures fine-grained local context, and a local critical information distillation mechanism that extracts and emphasizes the most relevant information. Together, these components allow the model to generate coherent and clinically accurate responses. In experiments on the MIMIC-III and n2c2 datasets, the proposed model outperforms state-of-the-art baselines, with significant improvements in perplexity, BLEU-2, recall at K, medical entity recognition, and response coherence. The proposed model represents a significant step toward reliable and contextually relevant multi-turn medical dialogue systems that can assist patients and healthcare professionals.
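
The abstract does not spell out how the contextual sliding window reply prediction task builds its training samples. As a minimal sketch of one plausible reading, the Python below pairs each reply in a dialogue with only the few turns immediately before it. The function name `sliding_window_samples`, the `window` parameter, and the example dialogue are all assumptions made for illustration; none of them comes from the paper.

```python
from typing import List, Tuple


def sliding_window_samples(
    turns: List[str], window: int = 2
) -> List[Tuple[List[str], str]]:
    """Pair every reply with the `window` turns immediately preceding it.

    Hypothetical helper: the paper's actual sample construction may differ.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    samples = []
    # Start at 1 because the first turn has no preceding context.
    for i in range(1, len(turns)):
        context = turns[max(0, i - window):i]  # local context only
        samples.append((context, turns[i]))
    return samples


# Made-up dialogue, used only to show the shape of the samples.
dialogue = [
    "Patient: I have had a headache for three days.",
    "Doctor: Is the pain on one side or both?",
    "Patient: Mostly the left side.",
    "Doctor: Any nausea or sensitivity to light?",
]

samples = sliding_window_samples(dialogue, window=2)
for context, reply in samples:
    print(context, "->", reply)
```

Under this reading, a small `window` restricts each prediction to nearby turns, which is one way a model could be pushed to attend to fine-grained local context rather than the full dialogue history.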
