Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, New York, USA.
Research Division and Department of Orthopedic Surgery, Hospital for Special Surgery, New York, New York, USA.
J Orthop Res. 2024 Jun;42(6):1276-1282. doi: 10.1002/jor.25782. Epub 2024 Jan 21.
Large language model (LLM) chatbots possess a remarkable capacity to synthesize complex information into concise, digestible summaries across a wide range of orthopedic subject matter. As LLM chatbots become widely available, they will serve as a powerful, accessible resource that patients, clinicians, and researchers may reference to obtain information about orthopedic science and clinical management. Here, we examined the performance of three well-known and easily accessible chatbots (ChatGPT, Bard, and Bing AI) in responding to inquiries relating to clinical management and orthopedic concepts. Although all three chatbots were capable of generating relevant responses, ChatGPT outperformed Bard and Bing AI in each category due to its ability to provide accurate and complete responses to orthopedic queries. Despite the chatbots' promising applications in clinical management, the shortcomings observed included incomplete responses, lack of context, and outdated information. Nonetheless, the ability of these LLM chatbots to address such inquiries has largely yet to be evaluated, and that evaluation will be critical for understanding the risks and opportunities of LLM chatbots in orthopedics.