
The role of large language models in improving the readability of orthopaedic spine patient educational material.

Authors

Romoff Melissa, Brunette Madison, Peterson Melanie K, Hashmi Sohaib Z, Kim Michael S

Affiliation

Department of Orthopaedic Surgery, University of California, Irvine, School of Medicine, 101 The City Dr S, Pavilion 3, Building 29 A, Orange, CA, 92868, USA.

Publication

J Orthop Surg Res. 2025 May 28;20(1):531. doi: 10.1186/s13018-025-05955-1.

DOI:10.1186/s13018-025-05955-1
PMID:40426209
Abstract

INTRODUCTION

Patient education is crucial for informed decision-making. Current educational materials are often written at a higher grade level than the American Medical Association (AMA)-recommended sixth-grade level. Few studies have assessed the readability of orthopaedic materials such as American Academy of Orthopaedic Surgeons (AAOS) OrthoInfo articles, and no studies have suggested efficient methods to improve readability. This study assessed the readability of OrthoInfo spine articles and investigated the ability of large language models (LLMs) to improve readability.

METHODS

A cross-sectional study analyzed 19 OrthoInfo articles using validated readability metrics (Flesch-Kincaid Grade Level and Reading Ease). Articles were simplified iteratively in three steps using ChatGPT, Gemini, and CoPilot. LLMs were prompted to summarize text, followed by two clarification prompts simulating patient inquiries. Word count, readability, and accuracy were assessed at each step. Accuracy was rated by two independent reviewers using a three-point scale (3 = fully accurate, 2 = minor inaccuracies, 1 = major inaccuracies). Statistical analysis included one-way and two-way ANOVA, followed by Tukey post-hoc tests for pairwise comparisons.
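The two readability metrics named above are closed-form functions of word, sentence, and syllable counts. The following is a minimal Python sketch of the standard published formulas. The abstract does not say which tool the authors used, so the function names and the vowel-group syllable counter are illustrative assumptions, and the heuristic will disagree with dictionary-based counters on some words.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) for plain English text."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "The spine is strong. It bends."
    w, s, syl = text_counts(sample)
    print(flesch_kincaid_grade(w, s, syl), flesch_reading_ease(w, s, syl))
```

Keeping the formulas separate from the counting step lets the same functions score counts produced by any other tokenizer or syllable dictionary.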

RESULTS

Baseline readability exceeded AMA recommendations, with a mean Flesch-Kincaid Grade Level of 9.5 and a Reading Ease score of 51.1. LLM summaries provided statistically significant improvement in readability, with the greatest improvements in the first iteration. All three LLMs performed similarly, though ChatGPT achieved statistically significant improvements in Reading Ease scores. Gemini incorporated appropriate disclaimers most consistently. Accuracy remained stable throughout, with no evidence of hallucination or compromise in content quality or medical relevance.
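To put the baseline figures in context, the short Python sketch below compares the reported mean grade level with the sixth-grade target and maps the reported Reading Ease score to Flesch's conventional descriptive bands. The band labels come from the general readability literature, not from this abstract, and the function names are hypothetical.

```python
AMA_TARGET_GRADE = 6.0  # sixth-grade level cited in the abstract

# Conventional Flesch Reading Ease bands as (lower bound, label), highest first.
# These labels are an assumption drawn from general usage, not from the study.
FLESCH_BANDS = [
    (90.0, "very easy"),
    (80.0, "easy"),
    (70.0, "fairly easy"),
    (60.0, "standard"),
    (50.0, "fairly difficult"),
    (30.0, "difficult"),
]


def grade_gap(grade_level: float, target: float = AMA_TARGET_GRADE) -> float:
    """Grade levels above (positive) or below (negative) the target."""
    return grade_level - target


def reading_ease_band(score: float) -> str:
    """Map a Flesch Reading Ease score to its descriptive band."""
    for lower_bound, label in FLESCH_BANDS:
        if score >= lower_bound:
            return label
    return "very difficult"


if __name__ == "__main__":
    # Baseline values reported in the abstract.
    print(grade_gap(9.5))           # grade levels above the target
    print(reading_ease_band(51.1))  # descriptive band for the baseline score
```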

DISCUSSION

LLMs effectively simplify orthopaedic educational content by reducing grade levels, enhancing readability, and maintaining acceptable accuracy. Readability improvements were most significant in initial simplification steps, with all models performing consistently. These findings support the integration of LLMs into patient education workflows, offering a scalable strategy to improve health literacy, enhance patient comprehension, and promote more equitable access to medical information across diverse populations.
