

Evaluating artificial intelligence chatbots for patient education in oral and maxillofacial radiology.

Author Information

Helvacioglu-Yigit Dilek, Demirturk Husniye, Ali Kamran, Tamimi Dania, Koenig Lisa, Almashraqi Abeer

Affiliations

College of Dental Medicine, QU Health, Qatar University, Doha, Qatar.

University of Pittsburgh School of Dental Medicine, Pittsburgh, PA, USA; Oral and Maxillofacial Radiology Consultant, Private Practice, Wexford, PA, USA.

Publication Information

Oral Surg Oral Med Oral Pathol Oral Radiol. 2025 Jun;139(6):750-759. doi: 10.1016/j.oooo.2025.01.001. Epub 2025 Jan 11.

Abstract

OBJECTIVE

This study aimed to compare the quality and readability of the responses generated by 3 publicly available artificial intelligence (AI) chatbots in answering frequently asked questions (FAQs) related to Oral and Maxillofacial Radiology (OMR) to assess their suitability for patient education.

STUDY DESIGN

Fifteen OMR-related questions were selected from professional patient information websites. These questions were posed to ChatGPT-3.5 by OpenAI, Gemini 1.5 Pro by Google, and Copilot by Microsoft to generate responses. Three board-certified OMR specialists evaluated the responses regarding scientific adequacy, ease of understanding, and overall reader satisfaction. Readability was assessed using the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) scores. The Wilcoxon signed-rank test was conducted to compare the scores assigned by the evaluators to the responses from the chatbots and professional websites. Interevaluator agreement was examined by calculating the Fleiss kappa coefficient.
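The FKGL and FRE scores used in the design above follow standard published formulas based on words per sentence and syllables per word. A minimal sketch in Python, using a naive vowel-group syllable counter (an assumption for illustration; real readability tools use dictionary-based syllable counts, so exact scores will differ):

```python
import re

def count_syllables(word: str) -> int:
    """Naive heuristic: count vowel groups, dropping a trailing silent 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (FKGL, FRE) for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # mean words per sentence
    spw = syllables / len(words)        # mean syllables per word
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    fre = 206.835 - 1.015 * wps - 84.6 * spw
    return fkgl, fre
```

Higher FKGL means a higher school-grade level is needed; higher FRE means easier text. A mean FKGL near 13, as reported below, corresponds roughly to a first-year college reading level, well above the sixth-to-eighth-grade level commonly recommended for patient education materials.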

RESULTS

There were no significant differences between groups in terms of scientific adequacy. In terms of readability, chatbots had overall mean FKGL and FRE scores of 12.97 and 34.11, respectively. Interevaluator agreement level was generally high.
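The Fleiss kappa coefficient behind the agreement result above is computed from a subjects-by-categories table of rater counts. A minimal pure-Python sketch of the computation (illustrative only; the study's actual rating data are not shown here):

```python
def fleiss_kappa(table: list[list[int]]) -> float:
    """Fleiss' kappa for a table where table[i][j] is the number of raters
    who assigned subject i to category j (equal raters per subject)."""
    n_subjects = len(table)
    n_raters = sum(table[0])
    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ) / n_subjects
    # Chance agreement from the marginal category proportions.
    totals = [sum(row[j] for row in table) for j in range(len(table[0]))]
    grand = n_subjects * n_raters
    p_e = sum((t / grand) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)
```

Kappa ranges from below 0 (worse than chance) to 1 (perfect agreement); values above roughly 0.6 are conventionally read as substantial agreement, consistent with the "generally high" agreement reported here.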

CONCLUSIONS

Although chatbots perform relatively well in responding to FAQs, validating AI-generated information with input from healthcare professionals can enhance patient care and safety. The text content of both the chatbots and the professional websites demands a high reading level.

