Yu Yunguo, Gomez-Cabello Cesar A, Makarova Svetlana, Parte Yogesh, Borna Sahar, Haider Syed Ali, Genovese Ariana, Prabha Srinivasagam, Forte Antonio J
Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA.
Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Road, Jacksonville, FL 32224, USA.
Bioengineering (Basel). 2024 Dec 28;12(1):17. doi: 10.3390/bioengineering12010017.
Current clinical care relies heavily on complex, rule-based systems for tasks like diagnosis and treatment. However, these systems can be cumbersome and require constant updates. This study explores the potential of the large language model (LLM) LLaMA 2 to address these limitations. We tested LLaMA 2's ability to interpret complex clinical process models, such as Mayo Clinic Care Pathway Models (CPMs), and to provide accurate clinical recommendations. The LLM was trained on versions of the pathways encoded in the DOT language and embedded with SentenceTransformer, and was then presented with hypothetical patient cases. We compared the LLM output with the ground truth at the token level by measuring both node and edge accuracy. LLaMA 2 accurately retrieved the diagnosis, suggested further evaluation, and delivered appropriate management steps, all based on the pathways. The average node accuracy across the different pathways was 0.91 (SD ± 0.045), while the average edge accuracy was 0.92 (SD ± 0.122). This study highlights the potential of LLMs for healthcare information retrieval, especially when relevant data are provided. Future research should focus on improving these models' interpretability and their integration into existing clinical workflows.
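To make the node and edge accuracy measures more concrete, the following is a minimal sketch of how such scores could be computed for a pathway written in DOT. It is not the authors' evaluation code: the abstract does not define the metrics precisely, so this sketch assumes that accuracy is the fraction of ground-truth nodes (or edges) that also appear in the model output. The pathway fragment, the node labels, and the helper names `parse_edges`, `nodes_of`, and `overlap_accuracy` are all hypothetical.

```python
import re

# Hypothetical pathway fragment in DOT; labels are placeholders, not a real CPM.
GROUND_TRUTH_DOT = """
digraph pathway {
    "Presenting complaint" -> "Initial evaluation";
    "Initial evaluation" -> "Diagnosis confirmed";
    "Diagnosis confirmed" -> "Management step";
}
"""

# Hypothetical model answer that gets the last step wrong.
LLM_OUTPUT_DOT = """
digraph answer {
    "Presenting complaint" -> "Initial evaluation";
    "Initial evaluation" -> "Diagnosis confirmed";
    "Diagnosis confirmed" -> "Unrelated step";
}
"""

# Matches simple quoted edges of the form "A" -> "B".
EDGE_PATTERN = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"')


def parse_edges(dot_text):
    """Return the set of (source, target) pairs found in a DOT string."""
    return set(EDGE_PATTERN.findall(dot_text))


def nodes_of(edges):
    """Return the set of node labels that occur in any edge."""
    return {node for edge in edges for node in edge}


def overlap_accuracy(predicted, reference):
    """Assumed metric: share of reference items that also appear in the prediction."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(predicted & reference) / len(reference)


truth_edges = parse_edges(GROUND_TRUTH_DOT)
predicted_edges = parse_edges(LLM_OUTPUT_DOT)

node_accuracy = overlap_accuracy(nodes_of(predicted_edges), nodes_of(truth_edges))
edge_accuracy = overlap_accuracy(predicted_edges, truth_edges)

print(f"node accuracy: {node_accuracy:.2f}")
print(f"edge accuracy: {edge_accuracy:.2f}")
```

In this toy case three of the four ground-truth nodes and two of the three ground-truth edges are recovered. The regular expression handles only quoted node names joined by `->`; a full DOT parser would be needed for attributes, subgraphs, or unquoted identifiers.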