Artificial Intelligence Shows Limited Success in Improving Readability Levels of Spanish-language Orthopaedic Patient Education Materials.

Author information

Busigó Torres Rodnell, Restrepo Mariana, Stern Brocha Z, Yahuaca B Israel, Buerba Rafael A, García Ivan A, Hernandez Victor H, Navarro Ronald A

Affiliations

Leni and Peter W. May Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Institute for Health Care Delivery Science, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Publication information

Clin Orthop Relat Res. 2025 Feb 11. doi: 10.1097/CORR.0000000000003413.

BACKGROUND: The more than 41 million people in the United States who speak Spanish represent one of the fastest-growing US populations. Non-English-speaking patients often face poorer health outcomes because of language barriers that hinder patient education. Orthopaedic education materials have limited availability in Spanish and may be difficult for some patients to read. The American Academy of Orthopaedic Surgeons (AAOS) has translated education materials into Spanish, but their readability levels remain unknown. Additionally, although artificial intelligence (AI) dialogue platforms have been shown to improve readability in English, no studies have specifically evaluated their effectiveness in non-English languages.

QUESTIONS/PURPOSES: (1) What is the readability of AAOS Spanish-language education materials? (2) Can an AI dialogue platform improve the readability of Spanish-language education materials while maintaining their accuracy and usefulness?

METHODS: After excluding COVID-19 articles and inaccessible websites, Spanish-language education materials were extracted from the AAOS OrthoInfo website, and their Fernández-Huerta and Spanish Orthographic Length (SOL) readability grade levels were calculated. Fernández-Huerta focuses on syntactic complexity (sentence and syllable structure) and SOL assesses lexical complexity (word length and frequency). For both, the higher the grade level, the harder it is to read. Education materials with a reading level above the sixth-grade level were inputted into the ChatGPT-4 AI platform to be adapted to a fifth-grade level. Readability metrics of the adaptations were reassessed and compared with the original versions. Secondarily, one of four Spanish-speaking orthopaedic surgeons evaluated each AI-adapted education material for accuracy and usefulness compared with the original version. We used a single review per material, trusting the orthopaedic surgeon's expertise to minimize discrepancies. We included a total of 77 of 82 education materials covering topics like diseases and conditions, treatment, and recovery and staying healthy.

RESULTS: Before AI adaptations, none of the 77 education materials met the recommended reading level of sixth grade or below according to both readability formulas. The original education materials were written at a seventh- to eighth-grade reading level in 32% of cases (25 of 77). In comparison, after a single attempt at simplification, AI-adapted materials achieved this reading level in 53% of cases (41 of 77; p < 0.001). Only 23% (18) and 16% (12) of the AI adaptations were written at or below the recommended sixth-grade level per the Fernández-Huerta and SOL grade levels, respectively. Of the AI adaptations, 52% (40) were rated as accurate and 56% (43) were rated as useful for patient education by the evaluating orthopaedic surgeons. AI adaptations that were classified as accurate or useful had a higher median (IQR) word count than those that were inaccurate (accurate 255 [216 to 331] versus inaccurate 236 [209 to 256]; p = 0.04) or not useful (useful 257 [216 to 337] versus not useful 233 [209 to 251]; p = 0.01).

CONCLUSION: Ongoing attention is needed to improve the readability of Spanish education materials to reduce health disparities. ChatGPT-4 has limited success in improving readability without compromising accuracy and usefulness. We urge AAOS to enhance the readability of these materials and recommend physicians use them as supplemental resources while prioritizing direct patient education for Spanish-speaking individuals. Further research is needed to develop readable and culturally appropriate education materials for non-English-speaking patients that incorporate direct patient feedback.

CLINICAL RELEVANCE: This study shows that Spanish-language orthopaedic materials often exceed recommended readability levels, limiting their effectiveness and worsening health disparities. While AI tools like ChatGPT-4 improve readability, they may fall short in accuracy and usefulness. This underscores the need for clearer, culturally appropriate materials and the importance of physicians providing direct education.
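
ILLUSTRATIVE SKETCH: The abstract names the Fernández-Huerta formula but does not state which variant or software the authors used. As a rough illustration only, the Python sketch below computes the commonly cited form of the score, 206.84 − 60 × (syllables per word) − 1.02 × (words per sentence), where a higher score means easier text. The function names, the simplified syllable counter, and the sample sentences are all assumptions made for this example; the conversion from score to grade level used in the study is not reproduced here, and SOL is not covered.

```python
import re

# Heuristic vowel classes for Spanish syllable counting (an approximation,
# not a full syllabification algorithm).
STRONG = set("aeoáéó")
WEAK_ACCENTED = set("íú")
VOWELS = set("aeiouáéíóúü")


def count_syllables(word: str) -> int:
    """Approximate the number of syllables in a Spanish word."""
    count = 0
    prev = None  # previous character if it was a vowel, otherwise None
    for ch in word.lower():
        if ch in VOWELS:
            if prev is None:
                count += 1  # a new vowel group starts a syllable
            elif (prev in STRONG and ch in STRONG) or ch in WEAK_ACCENTED or prev in WEAK_ACCENTED:
                count += 1  # hiatus: adjacent vowels fall in separate syllables
            prev = ch
        else:
            prev = None
    return max(count, 1)


def fernandez_huerta(text: str) -> float:
    """Commonly cited Fernández-Huerta score (higher means easier to read)."""
    words = re.findall(r"[a-záéíóúüñ]+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return 206.84 - 60.0 * (syllables / len(words)) - 1.02 * (len(words) / len(sentences))


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the AAOS materials.
    sample = "El hueso está roto. El médico pone un yeso."
    print(round(fernandez_huerta(sample), 2))
```

Because the syllable counter is a heuristic, scores from this sketch may differ slightly from those produced by validated readability tools.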
