Department of Otorhinolaryngology, Faculty of Medicine, Bezmialem Vakif University, Fatih, Istanbul, Turkey.
Department of Radiology, Faculty of Medicine, Bezmialem Vakif University, Fatih, Istanbul, Turkey.
Int J Pediatr Otorhinolaryngol. 2024 Jun;181:111998. doi: 10.1016/j.ijporl.2024.111998. Epub 2024 May 31.
This study examined the potential of ChatGPT as an accurate and readable source of information for parents seeking guidance on adenoidectomy, tonsillectomy, and ventilation tube insertion surgeries (ATVtis).
ChatGPT was tasked with identifying the 15 questions most frequently asked by parents on internet search engines for each of the three surgical procedures. We removed repeated questions from the initial set of 45 and then asked ChatGPT to generate answers to the remaining 33 questions. Seven highly experienced otolaryngologists individually graded the accuracy of each response on a four-level scale ranging from 'completely incorrect' to 'comprehensive.' The readability of the responses was determined using the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) scores. The questions were categorized into four groups: Diagnosis and Preparation Process, Surgical Information, Risks and Complications, and Postoperative Process. Responses were then compared across these groups by accuracy grade, FRE, and FKGL scores.
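For readers unfamiliar with the two readability metrics, the sketch below illustrates how FRE and FKGL are conventionally computed from words per sentence and syllables per word. It is not part of the study; the abstract does not specify the authors' tooling, and the syllable counter here is a deliberately naive heuristic for demonstration only.

```python
import re

def count_syllables(word: str) -> int:
    """Rough vowel-group heuristic for English syllable counting (illustrative only)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1  # drop a typical silent final 'e'
    return max(count, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / max(len(sentences), 1)   # words per sentence
    spw = syllables / max(len(words), 1)        # syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw    # standard FRE formula
    fkgl = 0.39 * wps + 11.8 * spw - 15.59      # standard FKGL formula
    return fre, fkgl

if __name__ == "__main__":
    sample = ("An adenoidectomy removes the adenoids. "
              "Most children go home the same day and recover quickly.")
    fre, fkgl = readability(sample)
    print(f"FRE = {fre:.1f}, FKGL = {fkgl:.1f}")
```

Higher FRE values indicate easier text, while FKGL approximates the US school grade needed to understand it; the AMA's sixth-grade recommendation corresponds roughly to FKGL ≤ 6.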
Seven evaluators each assessed 33 AI-generated responses, yielding 231 evaluations in total. Of these, 167 (72.3%) were classified as 'comprehensive,' 62 (26.8%) as 'correct but inadequate,' and 2 (0.9%) as 'some correct, some incorrect.' No response was judged 'completely incorrect' by any assessor. The mean FRE and FKGL scores were 57.15 (±10.73) and 9.95 (±1.91), respectively. Only 3 of ChatGPT's responses (9.1%) were at or below the sixth-grade reading level recommended by the American Medical Association (AMA). No significant differences were found between the question groups in readability or accuracy scores (p > 0.05).
ChatGPT can provide accurate answers to questions on a range of topics related to ATVtis. However, its answers may be too complex for some readers, as they are generally written at a high-school level, above the sixth-grade reading level recommended for patient information by the AMA. In our study, more than three-quarters of the AI-generated responses were written at or above the 10th-grade reading level, raising concerns about the readability of ChatGPT's output.