Source Characteristics Influence AI-Enabled Orthopaedic Text Simplification: Recommendations for the Future.

Authors

Andalib Saman, Solomon Sean S, Picton Bryce G, Spina Aidin C, Scolaro John A, Nelson Ariana M

Affiliations

University of California, Irvine, School of Medicine, Irvine, California.

Department of Orthopaedic Surgery, University of California, Irvine, Medical Center, Orange, California.

Publication information

JB JS Open Access. 2025 Jan 8;10(1). doi: 10.2106/JBJS.OA.24.00007. eCollection 2025 Jan-Mar.

DOI:10.2106/JBJS.OA.24.00007

PMID:39781102

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11703440/

Abstract

BACKGROUND

This study assesses the effectiveness of large language models (LLMs) in simplifying complex language within orthopaedic patient education materials (PEMs) and identifies predictive factors for successful text transformation.

METHODS

We transformed 48 orthopaedic PEMs using GPT-4, GPT-3.5, Claude 2, and Llama 2. The readability, quantified by the Flesch-Kincaid Reading Ease (FKRE) and Flesch-Kincaid Grade Level (FKGL) scores, was measured before and after transformation. Analysis included text characteristics such as syllable count, word length, and sentence length. Statistical and machine learning methods evaluated the correlations and predictive capacity of these features for transformation success.
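For readers unfamiliar with the two readability scores, the sketch below shows how they are conventionally computed from sentence, word, and syllable counts. It uses the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The syllable counter is a rough vowel-group heuristic chosen for illustration, and all function names are hypothetical; the abstract does not state which tool the authors used, so scores from this sketch may differ from theirs.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores indicate easier text."""
    n_sent, n_words, n_syll = text_stats(text)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade level; lower means easier text."""
    n_sent, n_words, n_syll = text_stats(text)
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59
```

Both formulas depend only on average sentence length (words per sentence) and average syllables per word, which is why these source-text features are natural candidates for predicting how much a transformation can improve the scores.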

RESULTS

All LLMs improved FKRE and FKGL scores (p < 0.01). GPT-4 showed superior performance, transforming PEMs to a seventh-grade reading level (mean FKGL, 6.72 ± 0.99), with higher FKRE and lower FKGL than other models. GPT-3.5, Claude 2, and Llama 2 significantly shortened sentences and overall text length (p < 0.01). Importantly, correlation analysis revealed that transformation success varied substantially with the model used, depending on original text factors such as word length and sentence complexity.
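The abstract does not specify which correlation statistic was used. As one plausible illustration, the sketch below computes a Pearson correlation between a source-text feature and the change in FKGL after transformation. The function name `pearson_r` and the numbers in the usage lines are invented placeholders for illustration only; they are not data or results from the study.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Placeholder values, NOT study data: mean sentence length of each source
# document and the FKGL reduction a hypothetical model achieved on it.
source_sentence_length = [12.0, 18.0, 22.0, 27.0, 31.0]
fkgl_reduction = [1.0, 2.0, 2.5, 4.0, 4.5]
r = pearson_r(source_sentence_length, fkgl_reduction)
```

Repeating such a calculation per model and per feature would give the kind of model-by-feature comparison the results describe, under the assumption that a linear association is an adequate summary.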

CONCLUSIONS

LLMs successfully simplify orthopaedic PEMs, with GPT-4 leading in readability improvement. This study highlights the importance of initial text characteristics in determining the effectiveness of LLM transformations, offering insights for optimizing orthopaedic health literacy initiatives using artificial intelligence (AI).

CLINICAL RELEVANCE

This study provides critical insights into the ability of LLMs to simplify complex orthopaedic PEMs, enhancing their readability without compromising informational integrity. By identifying predictive factors for successful text transformation, this research supports the application of AI in improving health literacy, potentially leading to better patient comprehension and outcomes in orthopaedic care.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebd1/11703440/6e5b8db31efb/jbjsoa-10-e24.00007-g001.jpg
