Gezer Mehmet Can, Armangil Mehmet
Department of Orthopedics and Traumatology, Mamak State Hospital, Ankara-Türkiye.
Department of Orthopedics and Traumatology, Hand Surgery Unit, Ankara University Faculty of Medicine, Ankara-Türkiye.
Ulus Travma Acil Cerrahi Derg. 2025 Apr;31(4):389-393. doi: 10.14744/tjtes.2025.32735.
This study aims to evaluate the accuracy and reliability of the Generative Pre-trained Transformer (ChatGPT; OpenAI, San Francisco, California) in answering patient-related questions about trigger finger. This evaluation has the potential to enhance patient education prior to treatment and offers insight into the role of artificial intelligence (AI)-based systems in the patient education process.
The ten most frequently asked questions regarding trigger finger were compiled from patient education websites and a literature review, then posed to ChatGPT. Two orthopedic specialists evaluated the responses using the Journal of the American Medical Association (JAMA) Benchmark criteria and the DISCERN instrument (A Tool for Judging the Quality of Written Consumer Health Information on Treatment Choices). Additionally, the readability of the responses was assessed using the Flesch-Kincaid Grade Level.
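For readers unfamiliar with the readability metric, the Flesch-Kincaid Grade Level maps a text to a U.S. school grade using the standard published formula (general background, not specific to this study's scoring):

FKGL = 0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) − 15.59

Higher values indicate more demanding text; scores of roughly 13 and above correspond to university-level reading.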
The DISCERN scores for ChatGPT's responses to trigger finger questions ranged from 35 to 47, with an average of 42, indicating "moderate" quality. While 60% of the responses were satisfactory, 40% contained deficiencies. According to the JAMA Benchmark criteria, the absence of scientific references was a significant drawback. The average readability level corresponded to the university level, making the information difficult to understand for patients with low health literacy. Improvements are needed to enhance the accessibility and comprehensibility of the content for a broader patient population.
To the best of our knowledge, this is the first study to investigate the use of ChatGPT in the context of trigger finger. While ChatGPT shows reasonable effectiveness in providing general information on trigger finger, expert oversight is necessary before it can be relied upon as a primary source for patient education.