Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation.

Authors

Villani Vincenzo, Nguyen Hong-Hanh T, Shanmugarajah Kumaran

Affiliations

Division of Immunology and Organ Transplantation, McGovern Medical School at UTHealth Houston, Houston, TX.

Liver Specialists of Texas, Houston, TX.

Publication information

Transplant Direct. 2024 Dec 10;11(1):e1740. doi: 10.1097/TXD.0000000000001740. eCollection 2025 Jan.

DOI:10.1097/TXD.0000000000001740
PMID:39668891
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11634323/

Abstract

BACKGROUND

The availability of high-quality and easy-to-read informative material is crucial to providing accurate information to prospective kidney donors. The quality of this information has been associated with the likelihood of proceeding with a living donation. Artificial intelligence-based large language models (LLMs) have recently become common instruments for acquiring information online, including medical information. The aim of this study was to assess the quality and readability of artificial intelligence-generated information on kidney donation.

METHODS

A set of 35 common donor questions was developed by the authors and used to interrogate 3 LLMs (ChatGPT, Google Gemini, and MedGPT). Answers were collected and independently evaluated using the CLEAR tool for (1) completeness, (2) lack of false information, (3) evidence-based information, (4) appropriateness, and (5) relevance. Readability was evaluated using the Flesch-Kincaid Reading Ease Score and the Flesch-Kincaid Grade Level.
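
The two readability measures named above can be sketched from their standard published formulas. This is a minimal illustration, not the authors' tooling: the abstract does not say which software was used, and the function names and the word, sentence, and syllable counts below are hypothetical. Counts are taken as inputs because syllable counting is itself heuristic and varies between tools.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_kincaid_grade_level(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    # Hypothetical counts for one chatbot answer (not data from the study).
    words, sentences, syllables = 100, 5, 150
    print(flesch_reading_ease(words, sentences, syllables))
    print(flesch_kincaid_grade_level(words, sentences, syllables))
```

Both formulas use the same two ratios, average sentence length and average syllables per word, so longer sentences and longer words lower the Reading Ease score and raise the Grade Level.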

RESULTS

The interrater intraclass correlation was 0.784 (95% confidence interval, 0.716-0.814). Median CLEAR scores were 22 for ChatGPT (interquartile range [IQR], 3.67), 24.33 for Google Gemini (IQR, 2.33), and 23.33 for MedGPT (IQR, 2.00). ChatGPT, Gemini, and MedGPT had mean Flesch-Kincaid Reading Ease Scores of 37.32 (SD = 10.00), 39.42 (SD = 13.49), and 29.66 (SD = 7.94), respectively. On the Flesch-Kincaid Grade Level assessment, ChatGPT had an average score of 12.29, Gemini 10.63, and MedGPT 13.21 (P < 0.001), indicating that all LLMs had readability at a college education level.

CONCLUSIONS

Current LLMs provide fairly accurate responses to common questions from prospective living kidney donors; however, the generated information is complex and requires an advanced level of education to read. As LLMs become more relevant in the field of medical information, transplant providers should familiarize themselves with the shortcomings of these technologies.
