


Artificial intelligence as a modality to enhance the readability of neurosurgical literature for patients.

Author Information

Guerra Gage A, Grove Sophie, Le Jonathan, Hofmann Hayden L, Shah Ishan, Bhagavatula Sweta, Fixman Benjamin, Gomez David, Hopkins Benjamin, Dallas Jonathan, Cacciamani Giovanni, Peterson Racheal, Zada Gabriel

Affiliations

Departments of 1Neurosurgery and 2Urology, University of Southern California, Los Angeles, California.

Publication Information

J Neurosurg. 2024 Nov 8;142(4):1189-1195. doi: 10.3171/2024.6.JNS24617. Print 2025 Apr 1.

DOI: 10.3171/2024.6.JNS24617
PMID: 39504543
Abstract

OBJECTIVE

In this study the authors assessed the ability of Chat Generative Pretrained Transformer (ChatGPT) 3.5 and ChatGPT4 to generate readable and accurate summaries of published neurosurgical literature.

METHODS

Abstracts published in journal issues released between June 2023 and August 2023 (n = 150) were randomly selected from the top 5 ranked neurosurgical journals according to Google Scholar. ChatGPT models were instructed to generate a readable layperson summary of each original abstract from a statistically validated prompt. Readability results and grade-level indicator (RR-GLI) scores were calculated for GPT3.5- and GPT4-generated summaries and for the original abstracts. Two physicians independently rated the accuracy of the ChatGPT-generated layperson summaries to assess scientific validity. One-way ANOVA followed by pairwise t-tests with Bonferroni correction was performed to compare readability scores. Cohen's kappa was used to assess interrater agreement between the two rater physicians.
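The interrater-agreement and multiple-comparison steps described above can be sketched with the standard textbook definitions. This is an illustrative sketch only: the `rater_a`/`rater_b` ratings are hypothetical, and the helpers are not the authors' actual analysis code.

```python
def bonferroni(p_value, num_comparisons):
    """Bonferroni correction: scale the raw p value by the number of
    pairwise comparisons performed, capping the result at 1.0."""
    return min(1.0, p_value * num_comparisons)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items:
    (observed agreement - chance agreement) / (1 - chance agreement),
    where chance agreement comes from each rater's label marginals."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_chance = sum((rater_a.count(lab) / n) * (rater_b.count(lab) / n)
                   for lab in labels)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical accuracy ratings (1 = scientifically accurate, 0 = not)
# from two independent physician reviewers:
rater_a = [1, 1, 0, 1, 0, 1]
rater_b = [1, 1, 0, 1, 1, 1]
print(cohens_kappa(rater_a, rater_b))
print(bonferroni(0.02, 3))  # three pairwise model comparisons
```

Kappa is preferred over raw percent agreement here because two raters who both label most summaries "accurate" would agree often by chance alone; kappa discounts that baseline.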

RESULTS

Analysis of 150 original abstracts showed a statistically significant difference for all RR-GLIs between the ChatGPT-generated summaries and original abstracts. The readability scores are formatted as follows (original abstract mean, GPT3.5 summary mean, GPT4 summary mean, p value): Flesch-Kincaid reading grade (12.55, 7.80, 7.70, p < 0.0001); Gunning fog score (15.46, 10.00, 9.00, p < 0.0001); Simple Measure of Gobbledygook (SMOG) index (11.30, 7.13, 6.60, p < 0.0001); Coleman-Liau index (14.67, 11.32, 10.26, p < 0.0001); automated readability index (10.87, 8.50, 7.75, p < 0.0001); and Flesch-Kincaid reading ease (33.29, 68.45, 69.55, p < 0.0001). GPT4-generated summaries demonstrated higher RR-GLIs than GPT3.5-generated summaries in the following categories: Gunning fog score (0.0003); SMOG index (0.027); Coleman-Liau index (< 0.0001); sentences (< 0.0001); complex words (< 0.0001); and % complex words (0.0035). A total of 68.4% and 84.2% of GPT3.5- and GPT4-generated summaries, respectively, maintained moderate scientific accuracy according to the two physician-reviewers.
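For reference, the two Flesch-Kincaid metrics reported above follow well-known published formulas over word, sentence, and syllable counts. The sketch below uses those textbook definitions with hypothetical counts; it is not the authors' scoring pipeline, which likely also handled tokenization and syllable counting.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level: approximate U.S. school grade
    needed to read the text. Higher = harder."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def flesch_reading_ease(words, sentences, syllables):
    """Flesch reading ease on a roughly 0-100 scale. Higher = easier,
    so it moves opposite to the grade-level indices."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

# Hypothetical counts for a 100-word abstract with 5 sentences
# and 150 syllables:
print(flesch_kincaid_grade(100, 5, 150))
print(flesch_reading_ease(100, 5, 150))
```

This direction difference explains the pattern in the results: the GPT summaries score lower on every grade-level index but higher on Flesch-Kincaid reading ease, and both shifts indicate easier text.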

CONCLUSIONS

The findings demonstrate promising potential for the application of ChatGPT in patient education. GPT4 is an accessible tool that can be an immediate solution for enhancing the readability of current neurosurgical literature. Layperson summaries generated by GPT4 would be a valuable addition to a neurosurgical journal and would be likely to improve comprehension for patients using internet resources like PubMed.


Similar Articles

1. Artificial intelligence as a modality to enhance the readability of neurosurgical literature for patients.
J Neurosurg. 2024 Nov 8;142(4):1189-1195. doi: 10.3171/2024.6.JNS24617. Print 2025 Apr 1.
2. Bridging the Gap Between Urological Research and Patient Understanding: The Role of Large Language Models in Automated Generation of Layperson's Summaries.
Urol Pract. 2023 Sep;10(5):436-443. doi: 10.1097/UPJ.0000000000000428. Epub 2023 Jul 5.
3. American academy of Orthopedic Surgeons' OrthoInfo provides more readable information regarding meniscus injury than ChatGPT-4 while information accuracy is comparable.
J ISAKOS. 2025 Apr;11:100843. doi: 10.1016/j.jisako.2025.100843. Epub 2025 Feb 21.
4. Accuracy, readability, and understandability of large language models for prostate cancer information to the public.
Prostate Cancer Prostatic Dis. 2024 May 14. doi: 10.1038/s41391-024-00826-y.
5. Evaluating Incontinence Abstracts: Artificial Intelligence-Generated Versus Cochrane Review.
Urogynecology (Phila). 2025 Apr 8. doi: 10.1097/SPV.0000000000001688.
6. Assessing the quality and readability of patient education materials on chemotherapy cardiotoxicity from artificial intelligence chatbots: An observational cross-sectional study.
Medicine (Baltimore). 2025 Apr 11;104(15):e42135. doi: 10.1097/MD.0000000000042135.
7. Optimizing Ophthalmology Patient Education via ChatBot-Generated Materials: Readability Analysis of AI-Generated Patient Education Materials and The American Society of Ophthalmic Plastic and Reconstructive Surgery Patient Brochures.
Ophthalmic Plast Reconstr Surg. 2024;40(2):212-216. doi: 10.1097/IOP.0000000000002549. Epub 2023 Nov 16.
8. Both Patients and Plastic Surgeons Prefer Artificial Intelligence-Generated Microsurgical Information.
J Reconstr Microsurg. 2024 Nov;40(9):657-664. doi: 10.1055/a-2273-4163. Epub 2024 Feb 21.
9. ChatGPT-4o's performance on pediatric Vesicoureteral reflux.
J Pediatr Urol. 2025 Apr;21(2):504-509. doi: 10.1016/j.jpurol.2024.12.002. Epub 2024 Dec 7.
10. Assessment of online patient education materials from major ophthalmologic associations.
JAMA Ophthalmol. 2015 Apr;133(4):449-54. doi: 10.1001/jamaophthalmol.2014.6104.

Cited By

1. ChatGPT-4o Compared With Human Researchers in Writing Plain-Language Summaries for Cochrane Reviews: A Blinded, Randomized Non-Inferiority Controlled Trial.
Cochrane Evid Synth Methods. 2025 Jul 28;3(4):e70037. doi: 10.1002/cesm.70037. eCollection 2025 Jul.
2. Enhancing Patient Comprehension of Glomerular Disease Treatments Using ChatGPT.
Healthcare (Basel). 2024 Dec 31;13(1):57. doi: 10.3390/healthcare13010057.