Readability, Reliability, and Quality Analysis of Internet-Based Patient Education Materials and Large Language Models on Meniere's Disease.

Author Information

Alamleh Salahaldin, Mavedatnia Dorsa, Francis Gizelle, Le Trung, Davies Joel, Lin Vincent, Lee John J W

Affiliations

Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.

Department of Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, ON, Canada.

Publication Information

J Otolaryngol Head Neck Surg. 2025 Jan-Dec;54:19160216251360651. doi: 10.1177/19160216251360651. Epub 2025 Aug 8.


DOI: 10.1177/19160216251360651
PMID: 40776601
Abstract

Importance: Online patient education materials (PEMs) and large language model (LLM) outputs can provide critical health information for patients, yet their readability, quality, and reliability remain unclear for Meniere's disease.

Objective: To assess the readability, quality, and reliability of online PEMs and LLM-generated outputs on Meniere's disease.

Design: Cross-sectional study.

Setting: PEMs were identified from the first 40 Google Search results based on inclusion criteria. LLM outputs were extracted from unique interactions with ChatGPT and Google Gemini.

Participants: Thirty-one PEMs met inclusion criteria. LLM outputs were obtained from 3 unique interactions each with ChatGPT and Google Gemini.

Intervention: Readability was assessed using 5 validated formulas [Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), Gunning-Fog Index, Coleman-Liau Index, and Simple Measure of Gobbledygook (SMOG) Index]. Quality and reliability were assessed by 2 independent raters using the DISCERN tool.

Main Outcome Measures: Readability was assessed for adherence to the American Medical Association's (AMA) sixth-grade reading level guideline. Source reliability, as well as the completeness, accuracy, and clarity of treatment-related information, was evaluated using the DISCERN tool.

Results: The most common PEM source type was academic institutions (32.2%), while the majority of PEMs (61.3%) originated from the United States. The mean FRE score for PEMs corresponded to a 10th- to 12th-grade reading level, whereas ChatGPT and Google Gemini outputs were classified at post-graduate and college reading levels, respectively. Only 16.1% of PEMs met the AMA's sixth-grade readability recommendation using the FKGL readability index, and no LLM outputs achieved this standard. Overall DISCERN scores categorized PEMs and ChatGPT outputs as "poor quality," while Google Gemini outputs were rated "fair quality." No significant differences were found in readability or DISCERN scores across PEM source types. Additionally, no significant correlation was identified between PEM readability, quality, and reliability scores.

Conclusions: Online PEMs and LLM-generated outputs on Meniere's disease do not meet AMA readability standards and are generally of poor quality and reliability.

Relevance: Future PEMs should prioritize improved readability while maintaining high-quality, reliable information to better support patient decision-making for patients with Meniere's disease.
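The five readability indices listed under Intervention are standard functions of word, sentence, and syllable counts, so scoring a PEM is straightforward to reproduce. The sketch below uses the third-party Python package textstat, an assumption on our part since the paper does not name its implementation; it also encodes the paper's pass/fail criterion of an FKGL at or below the AMA sixth-grade target.

```python
# Minimal sketch of the five readability indices named in the abstract,
# computed with the third-party `textstat` package (pip install textstat).
# Assumption: the paper does not state which implementation the authors used.
import textstat

AMA_GRADE_CEILING = 6  # AMA-recommended sixth-grade reading level

def readability_report(text: str) -> dict:
    """Score a patient education text with the five indices from the study."""
    return {
        "FRE": textstat.flesch_reading_ease(text),          # higher = easier
        "FKGL": textstat.flesch_kincaid_grade(text),        # US school grade
        "Gunning-Fog": textstat.gunning_fog(text),          # US school grade
        "Coleman-Liau": textstat.coleman_liau_index(text),  # US school grade
        "SMOG": textstat.smog_index(text),                  # US school grade
    }

def meets_ama_standard(text: str) -> bool:
    """The paper's pass/fail criterion: FKGL at or below sixth grade."""
    return textstat.flesch_kincaid_grade(text) <= AMA_GRADE_CEILING

if __name__ == "__main__":
    sample = (
        "Meniere's disease can cause vertigo, hearing loss, and ringing in "
        "the ears. Doctors often suggest a low-salt diet before medicines."
    )
    print(readability_report(sample))
    print("Meets AMA sixth-grade standard:", meets_ama_standard(sample))
```

Under this criterion, the study's finding that only 16.1% of PEMs pass corresponds to meets_ama_standard returning True for 5 of the 31 PEMs.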

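The DISCERN ratings in the Results ("poor" for PEMs and ChatGPT, "fair" for Google Gemini) come from a 16-item instrument with each item scored 1 to 5. A minimal sketch of the usual total-score banding follows; the cut-offs are a convention from the broader PEM literature, not values given in this abstract, so treat them as an assumption.

```python
# Hedged sketch of DISCERN total-score banding. DISCERN has 16 items, each
# rated 1-5, giving totals of 16-80. The cut-offs below are a banding
# commonly used in the PEM literature, NOT stated in this abstract, so
# treat them as an illustrative assumption.
def discern_band(total: int) -> str:
    """Map a DISCERN total (16-80) to a quality label."""
    if not 16 <= total <= 80:
        raise ValueError("DISCERN totals must fall between 16 and 80")
    if total <= 26:
        return "very poor"
    if total <= 38:
        return "poor"        # band reported for PEMs and ChatGPT outputs
    if total <= 50:
        return "fair"        # band reported for Google Gemini outputs
    if total <= 62:
        return "good"
    return "excellent"

# Example: a hypothetical rater total of 36 lands in the "poor" band.
assert discern_band(36) == "poor"
```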

Similar Articles

[1] Readability, Reliability, and Quality Analysis of Internet-Based Patient Education Materials and Large Language Models on Meniere's Disease. J Otolaryngol Head Neck Surg. 2025.

[2] Enhancing the Readability of Online Patient Education Materials Using Large Language Models: Cross-Sectional Study. J Med Internet Res. 2025 Jun 4.

[3] Currently Available Large Language Models Are Moderately Effective in Improving Readability of English and Spanish Patient Education Materials in Pediatric Orthopaedics. J Am Acad Orthop Surg. 2025 Jun 24.

[4] Improving the Readability of Institutional Heart Failure-Related Patient Education Materials Using GPT-4: Observational Study. JMIR Cardio. 2025 Jul 8.

[5] Is Information About Musculoskeletal Malignancies From Large Language Models or Web Resources at a Suitable Reading Level for Patients? Clin Orthop Relat Res. 2025 Feb 1.

[6] Leveraging large language models to improve patient education on dry eye disease. Eye (Lond). 2025 Apr.

[7] Leveraging Large Language Models to Enhance Patient Educational Resources in Rhinology. Ann Otol Rhinol Laryngol. 2025 Sep.

[8] ChatGPT-4 in Neurosurgery: Improving Patient Education Materials. Neurosurgery. 2025 Jul 24.

[9] Readability of Online Patient Education Materials for Congenital Hand Differences. Hand (N Y). 2024 Oct.

[10] Bridging Health Literacy Gaps in Spine Care: Using ChatGPT-4o to Improve Patient-Education Materials. J Bone Joint Surg Am. 2025 Jun 19.

