Sawamura Shogo, Bito Takanobu, Ando Takahiro, Masuda Kento, Kameyama Sakiko, Ishida Hiroyasu
Department of Rehabilitation, Heisei College of Health Sciences: 180 Kurono, Gifu City, Gifu 501-1131, Japan.
Department of Rehabilitation, Gifu University Hospital, Japan.
J Phys Ther Sci. 2024 May;36(5):234-239. doi: 10.1589/jpts.36.234. Epub 2024 May 1.
[Purpose] This study evaluated the accuracy of ChatGPT's responses to, and references for, five clinical questions in physical therapy based on the Clinical Practice Guidelines for Physical Therapy, and assessed this language model's potential as a tool for supporting clinical decision-making in the rehabilitation field. [Participants and Methods] Five clinical questions from the "Stroke", "Musculoskeletal disorders", and "Internal disorders" sections of the Clinical Practice Guidelines for Physical Therapy, released by the Japanese Society of Physical Therapy, were presented to ChatGPT. ChatGPT was instructed to provide responses in Japanese accompanied by references identified by PubMed IDs or digital object identifiers. The accuracy of the generated content and references was evaluated by two assessors with expertise in their respective sections using a 4-point scale, and comments were provided for point deductions. Inter-rater agreement was evaluated using weighted kappa coefficients. [Results] ChatGPT demonstrated adequate accuracy in generating content for clinical questions in physical therapy. However, the accuracy of the references was poor, with a significant number of references being non-existent or misinterpreted. [Conclusion] ChatGPT has limitations in reference selection and reliability. While ChatGPT can offer accurate responses to clinical questions in physical therapy, it should be used with caution because it is not a completely reliable model.
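[Purpose] Based on the Clinical Practice Guidelines for Physical Therapy, this study evaluated the accuracy of ChatGPT's responses to, and references for, five clinical questions in physical therapy, and assessed the language model's potential as a tool for supporting clinical decision-making in the rehabilitation field. [Participants and Methods] Five clinical questions from the "Stroke", "Musculoskeletal disorders", and "Internal disorders" sections of the Clinical Practice Guidelines for Physical Therapy, released by the Japanese Society of Physical Therapy, were presented to ChatGPT. ChatGPT was asked to respond in Japanese and to attach references such as PubMed IDs or digital object identifiers. Two assessors with expertise in their respective sections rated the accuracy of the generated content and references on a 4-point scale and commented on each point deduction. Inter-rater agreement was evaluated using weighted kappa coefficients. [Results] ChatGPT showed adequate accuracy in generating content for clinical questions in physical therapy. However, the accuracy of the references was poor: a large number of references did not exist or were misinterpreted. [Conclusion] ChatGPT has limitations in reference selection and reliability. Although ChatGPT can provide accurate responses to clinical questions in physical therapy, it should be used with caution because it is not a completely reliable model.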
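
The Methods mention weighted kappa coefficients for inter-rater agreement on a 4-point scale. The sketch below shows one way such a coefficient could be computed. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say whether linear or quadratic weights were used, so both are offered; the function name `weighted_kappa` is hypothetical; and the example scores are invented for illustration and are not data from the study.

```python
def weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4), weighting="linear"):
    """Cohen's weighted kappa for two raters scoring the same items.

    Assumption: ordered categories 1..4, matching the 4-point scale in the
    abstract. The weighting scheme is NOT stated in the abstract.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")

    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions: observed[i][j] is the share of items that
    # rater A placed in category i and rater B placed in category j.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weighting == "linear" else 2
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight: 0 on the diagonal, 1 for the farthest pair.
            w = (abs(i - j) / (k - 1)) ** power
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical scores for five questions (illustrative only, not study data).
    assessor_1 = [4, 4, 3, 2, 4]
    assessor_2 = [4, 3, 3, 2, 4]
    print(round(weighted_kappa(assessor_1, assessor_2), 3))
```

Weighted kappa suits an ordinal scale because a one-point disagreement is penalised less than a three-point disagreement, which unweighted kappa would treat as equally wrong.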