Nian Patrick P, Williams Christopher J, Senthilnathan Ithika S, Marsh Isabella G, Jones Ruth H, Palandjian Pari L, Heyer Jessica H, Doyle Shevaun M
From the Hospital for Special Surgery (Nian, Williams, Senthilnathan, Marsh, Jones, Palandjian, Heyer, Doyle), New York City, NY.
J Am Acad Orthop Surg. 2025 Jun 24. doi: 10.5435/JAAOS-D-25-00267.
Patient education materials (PEMs) consistently exceed the recommended sixth-grade reading level. Poor health literacy and limited English proficiency, particularly among the more than 40 million Spanish speakers, are associated with adverse patient outcomes. The use of artificial intelligence (AI) to improve readability has rarely been validated in Spanish PEMs or in pediatric orthopaedic PEMs. This study aimed to (1) assess the availability and readability of English and Spanish pediatric orthopaedic PEMs and (2) compare the efficacy of ChatGPT-4.0 and Google Gemini in improving readability.
Pediatric orthopaedic PEMs were collected from 13 websites of pediatric orthopaedic hospitals and societies. Grade levels were assessed using the Flesch-Kincaid Grade Level (FKGL) and Gunning Fog Index (GFI) for English articles and the FKGL and Spanish Simple Measure of Gobbledygook (SMOG) for Spanish articles. English and Spanish PEMs were additionally assessed using the Flesch Reading Ease (FRE) and Fernandez-Huerta Index (FHI), respectively. ChatGPT-4.0 and Google Gemini were prompted to rewrite article text at a sixth-grade level. Readability of the AI-converted text was compared categorically (proportion of articles at or below a sixth-grade level) and continuously (across all metrics).
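For readers unfamiliar with the English-language metrics named above, the following is a minimal sketch of the standard published formulas for FKGL, FRE, and GFI. It is not the study's own tooling, which the abstract does not describe. The function names and the choice to pass pre-computed counts (words, sentences, syllables, complex words) are assumptions made for illustration; how those counts are obtained (for example, how syllables are detected) varies between calculators and affects the scores.

```python
def fkgl(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level (standard published formula)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fre(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def gfi(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index; complex_words = words of three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def meets_sixth_grade(grade_level: float) -> bool:
    """Categorical outcome used in the abstract: grade level <= 6."""
    return grade_level <= 6.0


if __name__ == "__main__":
    # Hypothetical counts for a short passage (illustrative only, not study data).
    w, s, syl, cx = 100, 10, 150, 10
    print(f"FKGL: {fkgl(w, s, syl):.2f}")
    print(f"FRE:  {fre(w, s, syl):.2f}")
    print(f"GFI:  {gfi(w, s, cx):.2f}")
    print("At or below sixth grade (FKGL):", meets_sixth_grade(fkgl(w, s, syl)))
```

The Spanish-language metrics (Spanish SMOG and FHI) use different coefficients and are omitted from this sketch.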
Of 103 English articles, 40 (38.8%) were available in Spanish. At baseline, few articles had an FKGL at or below sixth grade: 5.8% of English and 10.0% of Spanish articles. Among English PEMs, 21.4% of ChatGPT-4.0-converted and 60.2% of Google Gemini-converted articles achieved an FKGL at or below sixth grade. Among Spanish PEMs, the corresponding proportions were 52.5% and 77.5%. Compared with ChatGPT-4.0, Google Gemini had greater absolute improvements in GFI, English FKGL, and Spanish SMOG, and a higher proportion of articles at or below a sixth-grade level (GFI, FKGL, Spanish SMOG) (all, P < 0.05).
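The abstract reports proportions rather than article counts. As a worked check, the short sketch below back-calculates the counts implied by the reported percentages and denominators (103 English, 40 Spanish articles) and confirms that each count rounds back to the reported figure. These counts are inferred from the percentages, not stated in the abstract, and the helper name `implied_count` is hypothetical.

```python
def implied_count(percent: float, total: int) -> int:
    """Nearest whole number of articles corresponding to a reported percentage."""
    return round(percent / 100 * total)


# (label, reported percentage, denominator) taken from the abstract
reported = [
    ("English articles available in Spanish", 38.8, 103),
    ("English baseline FKGL <= 6", 5.8, 103),
    ("Spanish baseline FKGL <= 6", 10.0, 40),
    ("English ChatGPT-4.0 FKGL <= 6", 21.4, 103),
    ("English Google Gemini FKGL <= 6", 60.2, 103),
    ("Spanish ChatGPT-4.0 FKGL <= 6", 52.5, 40),
    ("Spanish Google Gemini FKGL <= 6", 77.5, 40),
]

for label, pct, total in reported:
    n = implied_count(pct, total)
    # Round-trip check: the implied count reproduces the reported percentage.
    consistent = round(100 * n / total, 1) == pct
    print(f"{label}: {n}/{total} (consistent: {consistent})")
```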
Pediatric orthopaedic PEMs are limited by poor readability and by the low availability of Spanish-language versions. Medical societies and hospitals may use AI models, particularly Google Gemini, to improve readability and patient comprehension, but increasing access to Spanish PEMs is also necessary.