Assessing AI Simplification of Medical Texts: Readability and Content Fidelity.

Author Information

Picton Bryce, Andalib Saman, Spina Aidin, Camp Brandon, Solomon Sean S, Liang Jason, Chen Patrick M, Chen Jefferson W, Hsu Frank P, Oh Michael Y

Affiliations

Department of Neurological Surgery, University of California, Irvine, Orange, CA, USA.

School of Medicine, University of California, Irvine, Orange, CA, USA.

Publication Information

Int J Med Inform. 2025 Mar;195:105743. doi: 10.1016/j.ijmedinf.2024.105743. Epub 2024 Dec 1.


DOI: 10.1016/j.ijmedinf.2024.105743
PMID: 39667051
Abstract

INTRODUCTION: The escalating complexity of medical literature necessitates tools to enhance readability for patients. This study aimed to evaluate the efficacy of ChatGPT-4 in simplifying neurology and neurosurgical abstracts and patient education materials (PEMs) while assessing content preservation using Latent Semantic Analysis (LSA).

METHODS: A total of 100 abstracts (25 each from Neurosurgery, Journal of Neurosurgery, Lancet Neurology, and JAMA Neurology) and 340 PEMs (66 from the American Association of Neurological Surgeons, 274 from the American Academy of Neurology) were transformed by a GPT-4.0 prompt requesting a 5th grade reading level. Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FKRE) scores were used before/after transformation. Content fidelity was validated via LSA (ranging 0-1, 1 meaning identical topics) and by expert assessment (0-1) for a subset (n = 40). The Pearson correlation coefficient was used to compare the two assessments.

RESULTS: FKGL decreased from 12th to 5th grade for abstracts and from 13th to 5th for PEMs (p < 0.001). FKRE scores showed similar improvement (p < 0.001). LSA confirmed high content similarity for abstracts (mean cosine similarity 0.746) and PEMs (mean 0.953). Expert assessment indicated a mean topic similarity of 0.775 for abstracts and 0.715 for PEMs. The Pearson coefficient between LSA and expert assessment of textual similarity was 0.598 for abstracts and -0.167 for PEMs. Segmented analysis of the similarity correlations revealed a correlation of 0.48 (p = 0.02) below 450 words and of -0.20 (p = 0.43) above 450 words.

CONCLUSION: GPT-4.0 markedly improved the readability of medical texts, predominantly maintaining content integrity as substantiated by LSA and expert evaluations. LSA emerged as a reliable tool for assessing content fidelity within moderate-length texts, but its utility diminished for longer documents, where it overestimated similarity. These findings support the potential of AI in combating low health literacy; however, the similarity scores indicate that expert validation remains crucial. Future research must strive to improve transformation precision and develop better validation methodologies.
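The FKGL and FKRE scores reported above are simple formulas over two ratios: words per sentence and syllables per word. A minimal Python sketch of both formulas — the vowel-group syllable counter is a rough heuristic (real readability tools use dictionary-based syllable counts), so values only approximate what published tools report:

```python
import re

def count_syllables(word: str) -> int:
    """Approximate syllables by counting contiguous vowel groups."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text: str) -> tuple[float, float]:
    """Return (FKGL, Flesch Reading Ease) for a plain-text passage."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # words per sentence
    spw = syllables / len(words)   # syllables per word
    fkgl = 0.39 * wps + 11.8 * spw - 15.59      # Flesch-Kincaid Grade Level
    fre = 206.835 - 1.015 * wps - 84.6 * spw    # Flesch Reading Ease
    return fkgl, fre

fkgl, fre = readability("The cat sat on the mat. It was warm.")
print(round(fkgl, 1), round(fre, 1))
```

Lower FKGL means an easier grade level (the study's 12th-to-5th drop), while higher FKRE means easier reading, which is why the two metrics move in opposite directions on the same text.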

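The LSA fidelity metric reduces each original/simplified text pair to vectors and reports their cosine similarity (0 = unrelated topics, 1 = identical). This toy sketch shows only the cosine step over raw term counts; full LSA would first project the counts into a lower-dimensional topic space via SVD, which is omitted here:

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between two bag-of-words term-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)  # missing terms count as 0
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

original = "aneurysm repair reduces rupture risk"
simplified = "fixing the aneurysm lowers rupture risk"
print(round(cosine_similarity(original, simplified), 3))  # → 0.548
```

Because raw term counts reward any word overlap, long documents accumulate incidental matches — one plausible intuition for the study's finding that LSA overestimates similarity above roughly 450 words.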

Similar Articles

[1]
Assessing AI Simplification of Medical Texts: Readability and Content Fidelity.

Int J Med Inform. 2025 Mar

[2]
Tailoring glaucoma education using large language models: Addressing health disparities in patient comprehension.

Medicine (Baltimore). 2025 Jan 10

[3]
Evaluation of Generative Language Models in Personalizing Medical Information: Instrument Validation Study.

JMIR AI. 2024 Aug 13

[4]
Source Characteristics Influence AI-Enabled Orthopaedic Text Simplification: Recommendations for the Future.

JB JS Open Access. 2025 Jan 8

[5]
Can Artificial Intelligence Improve the Readability of Patient Education Materials on Aortic Stenosis? A Pilot Study.

Cardiol Ther. 2024 Mar

[6]
Assessing the Application of Large Language Models in Generating Dermatologic Patient Education Materials According to Reading Level: Qualitative Study.

JMIR Dermatol. 2024 May 16

[7]
Using Large Language Models to Generate Educational Materials on Childhood Glaucoma.

Am J Ophthalmol. 2024 Sep

[8]
Optimizing Ophthalmology Patient Education via ChatBot-Generated Materials: Readability Analysis of AI-Generated Patient Education Materials and The American Society of Ophthalmic Plastic and Reconstructive Surgery Patient Brochures.

Ophthalmic Plast Reconstr Surg.

[9]
Assessing parental comprehension of online resources on childhood pain.

Medicine (Baltimore). 2024 Jun 21

[10]
Unlocking the future of patient Education: ChatGPT vs. LexiComp® as sources of patient education materials.

J Am Pharm Assoc (2003). 2025

Cited By

[1]
ChatGPT-4o Compared With Human Researchers in Writing Plain-Language Summaries for Cochrane Reviews: A Blinded, Randomized Non-Inferiority Controlled Trial.

Cochrane Evid Synth Methods. 2025 Jul 28

[2]
Using AI to Translate and Simplify Spanish Orthopedic Medical Text: Instrument Validation Study.

JMIR AI. 2025 Mar 21

[3]
A structured evaluation of LLM-generated step-by-step instructions in cadaveric brachial plexus dissection.

BMC Med Educ. 2025 Jul 1

[4]
Tailoring glaucoma education using large language models: Addressing health disparities in patient comprehension.

Medicine (Baltimore). 2025 Jan 10

[5]
Enhancing Patient Comprehension of Glomerular Disease Treatments Using ChatGPT.

Healthcare (Basel). 2024 Dec 31
