Kunze Kyle N, Nwachukwu Benedict U, Cote Mark P, Ramkumar Prem N
Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A.
Arthroscopy. 2025 Mar;41(3):547-556. doi: 10.1016/j.arthro.2024.12.010. Epub 2024 Dec 16.
Large language models (LLMs) are generative artificial intelligence models that create content on the basis of the data on which they were trained. Their processing capabilities have evolved from text-only to multimodal, spanning text, images, audio, and video. In health care settings, LLMs are being applied to several clinically important areas, including patient care and workflow efficiency, communications, hospital operations and data management, medical education, practice management, and health care research. Under the umbrella of patient care, core use cases of LLMs include simplifying documentation tasks, enhancing patient communication (interactive language and written), conveying medical knowledge, and performing medical triage and diagnosis. However, LLMs warrant scrutiny when applied to health care tasks, as errors may have negative implications for health care outcomes, particularly with respect to perpetuation of bias, ethical considerations, and cost-effectiveness. Customized LLMs developed for narrower purposes may, through curation of their training data, help overcome certain performance limitations, transparency challenges, and biases present in contemporary generalized LLMs. Methods of customizing LLMs broadly fall under 4 categories: prompt engineering, retrieval-augmented generation, fine-tuning, and agentic augmentation, with each approach conferring different information-retrieval properties on the LLM. LEVEL OF EVIDENCE: Level V, expert opinion.