Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma.

Affiliations

Karsh Division of Gastroenterology and Hepatology, Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.

Bristol Medical School, University of Bristol, Bristol, UK.

Publication information

Clin Mol Hepatol. 2023 Jul;29(3):721-732. doi: 10.3350/cmh.2023.0089. Epub 2023 Mar 22.

DOI: 10.3350/cmh.2023.0089

PMID: 36946005

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10366809/

Abstract

BACKGROUND/AIMS

Patients with cirrhosis and hepatocellular carcinoma (HCC) require extensive and personalized care to improve outcomes. ChatGPT (Generative Pre-trained Transformer), a large language model, holds the potential to provide professional yet patient-friendly support. We aimed to examine the accuracy and reproducibility of ChatGPT in answering questions regarding knowledge, management, and emotional support for cirrhosis and HCC.

METHODS

ChatGPT's responses to 164 questions were independently graded by two transplant hepatologists, with disagreements resolved by a third reviewer. The performance of ChatGPT was also assessed using two published questionnaires and 26 questions formulated from the quality measures of cirrhosis management. Finally, its emotional support capacity was tested.
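
The grading workflow described here can be sketched in code. This is only an illustration under stated assumptions, not the authors' procedure: the grade labels, the `adjudicate` and `proportion` helpers, and the sample data are all hypothetical, and the abstract does not say how the third reviewer reached a decision.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical grade labels; the abstract itself only mentions
# responses being "correct" and "comprehensive".
Grade = str


def adjudicate(
    grades: Dict[str, Tuple[Grade, Grade]],
    third_reviewer: Callable[[str, Grade, Grade], Grade],
) -> Dict[str, Grade]:
    """Return one final grade per question.

    If the two independent graders agree, their shared grade stands.
    Otherwise the third reviewer decides (assumed tie-break rule).
    """
    final: Dict[str, Grade] = {}
    for question, (first, second) in grades.items():
        if first == second:
            final[question] = first
        else:
            final[question] = third_reviewer(question, first, second)
    return final


def proportion(final: Dict[str, Grade], accepted: Set[Grade]) -> float:
    """Share of questions whose final grade is in `accepted`."""
    if not final:
        raise ValueError("no graded questions")
    return sum(grade in accepted for grade in final.values()) / len(final)


# Made-up example data, for illustration only.
sample = {
    "q1": ("comprehensive", "comprehensive"),
    "q2": ("comprehensive", "correct_but_inadequate"),
    "q3": ("incorrect", "incorrect"),
    "q4": ("correct_but_inadequate", "correct_but_inadequate"),
}

# Stand-in third reviewer that always sides with the first grader.
final_grades = adjudicate(sample, lambda question, first, second: first)
correct_share = proportion(
    final_grades, {"comprehensive", "correct_but_inadequate"}
)
```

Passing the third reviewer in as a callable keeps the tie-break rule separate from the agreement check, so a different resolution policy could be substituted without changing the rest of the sketch.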

RESULTS

We showed that ChatGPT regurgitated extensive knowledge of cirrhosis (79.1% correct) and HCC (74.0% correct), but only small proportions (47.3% in cirrhosis, 41.1% in HCC) were labeled as comprehensive. The performance was better in basic knowledge, lifestyle, and treatment than in the domains of diagnosis and preventive medicine. For the quality measures, the model answered 76.9% of questions correctly but failed to specify decision-making cut-offs and treatment durations. ChatGPT lacked knowledge of regional variations in guidelines, such as HCC screening criteria. However, it provided practical and multifaceted advice to patients and caregivers regarding the next steps and adjusting to a new diagnosis.
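
As a quick arithmetic check, the reported 76.9% for the quality measures can be mapped back to a whole number of questions. The sketch below assumes the percentage refers to all 26 quality-measure questions mentioned in the methods and that it was rounded to one decimal place; the helper name `counts_matching` is hypothetical.

```python
from typing import List


def counts_matching(percent: float, total: int, decimals: int = 1) -> List[int]:
    """Integer counts k in 0..total for which 100*k/total rounds to `percent`."""
    return [
        k for k in range(total + 1)
        if round(100 * k / total, decimals) == percent
    ]


# Assumption: 76.9% is the share of the 26 quality-measure questions.
matches = counts_matching(76.9, 26)
print(matches)  # [20]
```

Under these assumptions, only 20 of 26 correct answers is consistent with the reported figure (20/26 ≈ 76.92%).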

CONCLUSION

We analyzed the areas of robustness and limitations of ChatGPT's responses on the management of cirrhosis and HCC and relevant emotional support. ChatGPT may have a role as an adjunct informational tool for patients and physicians to improve outcomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d122/10366809/e6c1fdafc34b/cmh-2023-0089f1.jpg
