Resident, Department of Oral and Maxillofacial Surgery, Rutgers School of Dental Medicine, Newark, NJ.
Candidate, Rutgers New Jersey Medical School, Newark, NJ.
J Oral Maxillofac Surg. 2024 Oct;82(10):1239-1245. doi: 10.1016/j.joms.2024.06.177. Epub 2024 Jul 2.
BACKGROUND: Artificial intelligence (AI) platforms such as Chat Generative Pre-Trained Transformer (ChatGPT) (OpenAI, San Francisco, California, USA) have the capacity to answer health-related questions. It remains unknown whether AI can be a patient-friendly and accurate resource regarding third molar extraction.
PURPOSE: The purpose of this study was to determine the accuracy and readability of AI responses to common patient questions regarding third molar extraction.
STUDY DESIGN, SETTING, SAMPLE: This is a cross-sectional in silico assessment of the readability and soundness of a computer-generated report.
PREDICTOR VARIABLE: Not applicable.
MAIN OUTCOME VARIABLES: Accuracy, or the ability to provide clinically correct and relevant information, was determined subjectively by 2 reviewers using a 5-point Likert scale and objectively by comparing responses to American Association of Oral and Maxillofacial Surgeons (AAOMS) clinical consensus papers. Readability, or how easy a piece of text is to read, was assessed using the Flesch-Kincaid Reading Ease (FKRE) and Flesch-Kincaid Grade Level (FKGL). Both assess readability from the mean number of syllables per word and the mean number of words per sentence. To be deemed readable, the FKRE should be >60 and the FKGL should be <8.
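(For context, these scores follow the standard published Flesch formulas, which the abstract itself does not reproduce; the constants below come from the readability literature, not from this study:

FKRE = 206.835 − 1.015 × (total words / total sentences) − 84.6 × (total syllables / total words)
FKGL = 0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) − 15.59

Because FKRE falls as sentences and words lengthen while FKGL rises, harder text yields a lower FKRE and a higher FKGL, which is why the two readability thresholds run in opposite directions.)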
COVARIATES: Not applicable.
ANALYSES: Descriptive statistics were used to analyze the findings of this study.
RESULTS: AI-generated responses were written above the recommended reading level for the average patient (FKRE: 52; FKGL: 10). The average Likert score was 4.36, indicating that most responses were accurate, with only minor inaccuracies or missing information. AI correctly deferred to the provider in instances where no definitive answer exists. Of the responses that addressed content covered in AAOMS consensus papers, 18 of 19 aligned closely with them. No responses provided citations or references.
CONCLUSIONS: AI provided mostly accurate responses, and content aligned closely with AAOMS guidelines. However, responses were too complex for the average third molar extraction patient and lacked citations and references. Providers should educate patients on the utility of AI and decide whether to recommend it as an information source. Ultimately, the best resource for answers remains the practitioners themselves, because AI platforms lack clinical experience.