Artificial Intelligence to Automate Health Economic Modelling: A Case Study to Evaluate the Potential Application of Large Language Models.

Authors

Reason Tim, Rawlinson William, Langham Julia, Gimblett Andy, Malcolm Bill, Klijn Sven

Affiliations

Estima Scientific, Mediaworks, 191 Wood Ln, London, W12 7FP, UK.

Bristol Myers Squibb, Uxbridge, UK.

Publication information

Pharmacoecon Open. 2024 Mar;8(2):191-203. doi: 10.1007/s41669-024-00477-8. Epub 2024 Feb 10.

Abstract

BACKGROUND

Current generation large language models (LLMs) such as Generative Pre-Trained Transformer 4 (GPT-4) have achieved human-level performance on many tasks including the generation of computer code based on textual input. This study aimed to assess whether GPT-4 could be used to automatically programme two published health economic analyses.

METHODS

The two analyses were partitioned survival models evaluating interventions in non-small cell lung cancer (NSCLC) and renal cell carcinoma (RCC). We developed prompts which instructed GPT-4 to programme the NSCLC and RCC models in R, and which provided descriptions of each model's methods, assumptions and parameter values. The results of the generated scripts were compared to the published values from the original, human-programmed models. The models were replicated 15 times to capture variability in GPT-4's output.
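For readers unfamiliar with the model type, the following is a minimal sketch of how a partitioned survival model derives health-state occupancy from overall survival and progression-free survival curves. It is written in Python for illustration (the study's scripts were in R), and the exponential curves, function names and hazard values are all assumptions, not details from the published NSCLC or RCC models.

```python
import math


def exponential_survival(rate, t):
    """Survival probability at time t under a constant hazard (an assumed curve shape)."""
    return math.exp(-rate * t)


def state_occupancy(os_rate, pfs_rate, cycle_length, n_cycles):
    """Return (progression_free, progressed, dead) proportions at each cycle boundary."""
    trace = []
    for cycle in range(n_cycles + 1):
        t = cycle * cycle_length
        overall = exponential_survival(os_rate, t)
        # Progression-free survival is capped so it never exceeds overall survival.
        progression_free = min(exponential_survival(pfs_rate, t), overall)
        progressed = overall - progression_free
        dead = 1.0 - overall
        trace.append((progression_free, progressed, dead))
    return trace


# Hypothetical monthly hazards, not taken from the NSCLC or RCC models.
trace = state_occupancy(os_rate=0.03, pfs_rate=0.08, cycle_length=1.0, n_cycles=12)
```

Costs and quality-adjusted life years would then be accumulated by weighting each state's occupancy per cycle, which is the part of a real model that carries most of the parameter values.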

RESULTS

GPT-4 fully replicated the NSCLC model with high accuracy: 100% (15/15) of the artificial intelligence (AI)-generated NSCLC models were error-free or contained a single minor error, and 93% (14/15) were completely error-free. GPT-4 closely replicated the RCC model, although human intervention was required to simplify an element of the model design (one of the model's fifteen input calculations) because it used too many sequential steps to be implemented in a single prompt. With this simplification, 87% (13/15) of the AI-generated RCC models were error-free or contained a single minor error, and 60% (9/15) were completely error-free. Error-free model scripts replicated the published incremental cost-effectiveness ratios to within 1%.
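The replication criterion above can be illustrated with a short sketch: compute an incremental cost-effectiveness ratio (ICER) and check whether a generated value falls within 1% of a published one. The function names and all numbers below are hypothetical and chosen only to show the arithmetic; they are not results from the study.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost per quality-adjusted life year gained."""
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return (cost_new - cost_old) / delta_qaly


def replicates_within(generated, published, tolerance=0.01):
    """True if the generated value is within a relative tolerance of the published one."""
    return abs(generated - published) <= tolerance * abs(published)


# Hypothetical totals for illustration only.
published_icer = icer(cost_new=50000, cost_old=30000, qaly_new=2.0, qaly_old=1.5)
close_enough = replicates_within(40300, published_icer)
too_far = replicates_within(40500, published_icer)
```

A relative tolerance is used here because ICERs vary widely in magnitude between models; whether the study applied the 1% threshold in exactly this way is an assumption.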

CONCLUSION

This study provides a promising indication that GPT-4 can have practical applications in the automation of health economic model construction. Potential benefits include accelerated model development timelines and reduced costs of development. Further research is necessary to explore the generalisability of LLM-based automation across a larger sample of models.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6cf/10884386/5d1c9a59f93b/41669_2024_477_Fig1_HTML.jpg
