Evaluating the positive predictive value of code-based identification of cirrhosis and its complications utilizing GPT-4.

Author information

Far Aryana T, Bastani Asal, Lee Albert, Gologorskaya Oksana, Huang Chiung-Yu, Pletcher Mark J, Lai Jennifer C, Ge Jin

Affiliations

Department of Medicine, Division of Gastroenterology and Hepatology, University of California-San Francisco, San Francisco, California, USA.

Academic Research Services, University of California-San Francisco, San Francisco, California, USA.

Publication information

Hepatology. 2025 Jun 1;81(6):1753-1763. doi: 10.1097/HEP.0000000000001115. Epub 2024 Oct 8.

Abstract

BACKGROUND AND AIMS

Diagnosis code classification is a common method for cohort identification in cirrhosis research, but it is often inaccurate and is typically augmented by labor-intensive chart review. Natural language processing using large language models (LLMs) is a potentially more accurate method. To assess LLMs' potential for cirrhosis cohort identification, we compared code-based versus LLM-based classification, using chart review as the "gold standard."

APPROACH AND RESULTS

We extracted 3788 discharge summaries of cirrhosis admissions and conducted a limited chart review. We engineered zero-shot prompts for Generative Pre-trained Transformer 4 (GPT-4) to determine whether cirrhosis and its complications were active hospitalization problems. We calculated positive predictive values (PPVs) of LLM-based classification versus limited chart review and, using LLM-based classification as a "silver standard," PPVs of code-based classification in all 3788 summaries. Compared to gold standard chart review, code-based classification achieved PPVs of 82.2% for identifying cirrhosis, 41.7% for hepatic encephalopathy (HE), 72.8% for ascites, 59.8% for gastrointestinal bleeding, and 48.8% for spontaneous bacterial peritonitis. Compared to chart review, GPT-4 achieved accuracies of 87.8%-98.8% for identifying cirrhosis and its complications. Using the LLM as a silver standard, code-based classification achieved PPVs of 79.8% for identifying cirrhosis, 53.9% for HE, 55.3% for ascites, 67.6% for gastrointestinal bleeding, and 65.5% for spontaneous bacterial peritonitis.
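For readers less familiar with the metric, the PPV reported above is the share of cases flagged by a classifier (here, diagnosis codes) that the reference standard (chart review or the LLM) confirms. The following is a minimal sketch of that calculation only. The function name and the toy labels are hypothetical illustrations and are not taken from the study's data or code.

```python
from typing import Sequence


def positive_predictive_value(flagged: Sequence[bool], reference: Sequence[bool]) -> float:
    """PPV = confirmed positives / all cases the classifier flagged positive."""
    if len(flagged) != len(reference):
        raise ValueError("flagged and reference must have the same length")
    # True positives: flagged by the classifier AND confirmed by the reference.
    true_pos = sum(1 for f, r in zip(flagged, reference) if f and r)
    flagged_pos = sum(1 for f in flagged if f)
    if flagged_pos == 0:
        raise ValueError("PPV is undefined when nothing is flagged positive")
    return true_pos / flagged_pos


# Hypothetical toy labels for six admissions (NOT study data).
code_flags = [True, True, True, True, False, True]      # diagnosis code present?
chart_review = [True, False, True, True, False, False]  # reference standard says active?

ppv = positive_predictive_value(code_flags, chart_review)
print(f"PPV: {ppv:.1%}")  # 3 confirmed of 5 flagged -> PPV: 60.0%
```

Swapping the reference labels from chart review to LLM output gives the "silver standard" variant of the same calculation.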

CONCLUSIONS

LLM-based classification was highly accurate versus manual chart review in identifying cirrhosis and its complications. This allowed us to assess the performance of code-based classification at scale using LLMs as a silver standard. These results suggest LLMs could augment or replace code-based cohort classification and raise questions regarding the necessity of chart review.
