Evaluating the quality and readability of ChatGPT-generated patient-facing medical information in rhinology.

Authors

Fazilat Alexander Z, Brenac Camille, Kawamoto-Duran Danae, Berry Charlotte E, Alyono Jennifer, Chang Michael T, Liu David T, Patel Zara M, Tringali Stéphane, Wan Derrick C, Fieux Maxime

Affiliations

Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, CA, USA.

Service de chirurgie plastique reconstructrice et esthétique, Hospices Civils de Lyon, Hôpital de la Croix Rousse, Lyon, F-69004, France.

Publication information

Eur Arch Otorhinolaryngol. 2025 Apr;282(4):1911-1920. doi: 10.1007/s00405-024-09180-0. Epub 2024 Dec 26.

DOI:10.1007/s00405-024-09180-0

PMID:39724239

Abstract

PURPOSE

The artificial intelligence (AI) chatbot ChatGPT has become a major tool for generating responses in healthcare. This study assessed ChatGPT's ability to generate French preoperative patient-facing medical information (PFI) in rhinology at a comparable level to material provided by an academic source, the French Society of Otorhinolaryngology (Société Française d'Otorhinolaryngologie et Chirurgie Cervico-Faciale, SFORL).

METHODS

ChatGPT and SFORL French preoperative PFI in rhinology were compared by analyzing responses to 16 questions regarding common rhinology procedures: ethmoidectomy, sphenoidotomy, septoplasty, and endonasal dacryocystorhinostomy. Twenty rhinologists assessed the clarity, comprehensiveness, accuracy, and overall quality of the information, while 24 nonmedical individuals analyzed the clarity and overall quality. Six readability formulas were used to compare readability scores.
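The abstract does not name the six readability formulas. Purely as an illustration of what such a formula computes, the sketch below implements the classic Flesch Reading Ease score, which is calibrated for English; the French texts in this study would call for a French-adapted formula. The function names and the vowel-group syllable counter are assumptions made for this example and are not taken from the study.

```python
import re


def count_syllables(word: str) -> int:
    """Naive heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Classic Flesch Reading Ease (English calibration); higher means easier to read."""
    # Split on sentence-ending punctuation and ignore empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # ASCII letters only, so accented French words are not handled here.
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical sample sentences, not drawn from ChatGPT or SFORL material.
simple = "The nose is blocked. The doctor will open it."
dense = ("Endonasal dacryocystorhinostomy necessitates comprehensive "
         "preoperative otorhinolaryngological evaluation.")

print(flesch_reading_ease(simple) > flesch_reading_ease(dense))
```

Short sentences built from short words score higher, which is the sense in which one patient leaflet can be called more readable than another.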

RESULTS

Among rhinologists, no significant difference was found between ChatGPT and SFORL regarding clarity (7.61 ± 0.36 vs. 7.53 ± 0.28; p = 0.485), comprehensiveness (7.32 ± 0.77 vs. 7.58 ± 0.50; p = 0.872), or accuracy (inaccuracies: 60% vs. 40%; p = 0.228). Nonmedical individuals scored the clarity of ChatGPT significantly higher than that of the SFORL (8.16 ± 1.16 vs. 6.32 ± 1.33; p < 0.0001). Nonmedical individuals chose ChatGPT as the most informative source significantly more often than rhinologists did (62.8% vs. 39.7%; p < 0.001).

CONCLUSION

ChatGPT-generated French preoperative PFI in rhinology was comparable to SFORL-provided PFI regarding clarity, comprehensiveness, accuracy, readability, and overall quality. This study highlights ChatGPT's potential to increase accessibility to high-quality PFI and suggests its use by physicians as a complement to academic resources written by learned societies such as the SFORL.
