Andalib Saman, Solomon Sean S, Picton Bryce G, Spina Aidin C, Scolaro John A, Nelson Ariana M
University of California, Irvine, School of Medicine, Irvine, California.
Department of Orthopaedic Surgery, University of California, Irvine, Medical Center, Orange, California.
JB JS Open Access. 2025 Jan 8;10(1). doi: 10.2106/JBJS.OA.24.00007. eCollection 2025 Jan-Mar.
BACKGROUND: This study assesses the effectiveness of large language models (LLMs) in simplifying complex language within orthopaedic patient education materials (PEMs) and identifies predictive factors for successful text transformation.
METHODS: We transformed 48 orthopaedic PEMs using GPT-4, GPT-3.5, Claude 2, and Llama 2. The readability, quantified by the Flesch-Kincaid Reading Ease (FKRE) and Flesch-Kincaid Grade Level (FKGL) scores, was measured before and after transformation. Analysis included text characteristics such as syllable count, word length, and sentence length. Statistical and machine learning methods evaluated the correlations and predictive capacity of these features for transformation success.
RESULTS: All LLMs improved FKRE and FKGL scores (p < 0.01). GPT-4 showed superior performance, transforming PEMs to a seventh-grade reading level (mean FKGL, 6.72 ± 0.99), with higher FKRE and lower FKGL than other models. GPT-3.5, Claude 2, and Llama 2 significantly shortened sentences and overall text length (p < 0.01). Importantly, correlation analysis revealed that transformation success varied substantially with the model used, depending on original text factors such as word length and sentence complexity.
CONCLUSIONS: LLMs successfully simplify orthopaedic PEMs, with GPT-4 leading in readability improvement. This study highlights the importance of initial text characteristics in determining the effectiveness of LLM transformations, offering insights for optimizing orthopaedic health literacy initiatives using artificial intelligence (AI).
CLINICAL RELEVANCE: This study provides critical insights into the ability of LLMs to simplify complex orthopaedic PEMs, enhancing their readability without compromising informational integrity. By identifying predictive factors for successful text transformation, this research supports the application of AI in improving health literacy, potentially leading to better patient comprehension and outcomes in orthopaedic care.
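The two readability scores named in the METHODS can be illustrated with a minimal Python sketch. The coefficients are the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas; everything else is an assumption made for illustration, not the authors' pipeline. In particular, the syllable counter is a rough vowel-group heuristic, and the function names (`count_syllables`, `text_stats`, `reading_ease`, `grade_level`) are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def grade_level(sentences: int, words: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    original = "Orthopaedic rehabilitation necessitates multidisciplinary collaboration."
    simplified = "Your bone will heal."
    for label, text in (("original", original), ("simplified", simplified)):
        stats = text_stats(text)
        print(label, round(reading_ease(*stats), 2), round(grade_level(*stats), 2))
```

Both formulas depend only on words per sentence and syllables per word, which is consistent with the abstract's focus on sentence length, word length, and syllable count as predictors of transformation success.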