Approaches for extracting daily dosage from free-text prescription signatures in heart failure with reduced ejection fraction: a comparative study.

Authors

Haaker Theodorus S, Choi Joshua S, Nanjo Claude J, Warner Phillip B, Abu-Hanna Ameen, Kawamoto Kensaku

Affiliations

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT 84108, United States.

Department of Medical Informatics, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands.

Publication information

JAMIA Open. 2025 Jan 3;8(1):ooae153. doi: 10.1093/jamiaopen/ooae153. eCollection 2025 Feb.

Abstract

OBJECTIVE

To compare various methods for extracting daily dosage information from prescription signatures (sigs) and identify the best performers.

MATERIALS AND METHODS

Five daily dosage extraction methods were identified and selected: Parsigs, RxSig, Sig2db, a large language model (LLM), and a bidirectional long short-term memory (BiLSTM) model. The methods were compared on positive predictive value (PPV), sensitivity, F1-score, computing cost, and run time, using a sig dataset in the context of heart failure with reduced ejection fraction.
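The three accuracy metrics named above can be sketched as follows. This is a minimal illustration and not the study's evaluation code: the function name `extraction_metrics` and the counting convention are assumptions. Here a method may abstain (return `None`), and an asserted dose that differs from the reference is counted as both a false positive and a false negative.

```python
def extraction_metrics(predicted, gold):
    """Return (PPV, sensitivity, F1) for paired daily-dose values.

    predicted, gold: equal-length lists of floats, or None where no
    daily dose is asserted (predicted) or available (gold).
    Counting convention (an assumption for this sketch): a wrong
    asserted value counts as a false positive AND a false negative.
    """
    tp = fp = fn = 0
    for p, g in zip(predicted, gold):
        if p is not None and g is not None and abs(p - g) < 1e-9:
            tp += 1
        else:
            if p is not None:
                fp += 1  # asserted a dose that is wrong or unsupported
            if g is not None:
                fn += 1  # reference dose was not recovered
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = ppv + sensitivity
    f1 = 2 * ppv * sensitivity / denom if denom else 0.0
    return ppv, sensitivity, f1


# Hypothetical toy data, not taken from the study.
predicted = [50.0, None, 25.0, 10.0]
gold = [50.0, 20.0, 25.0, 5.0]
ppv, sensitivity, f1 = extraction_metrics(predicted, gold)
```

Under this convention, a method that abstains on uncertain sigs keeps its PPV high at the expense of sensitivity.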

RESULTS

The dataset consisted of 29 896 free-text sigs, split 70%/30% into training and validation sets. The BiLSTM model had the lowest F1-score (0.71). The LLM (GPT-4o) and the regular expression-based RxSig achieved the highest F1-scores, 0.98 and 0.95, respectively. The LLM outperformed RxSig in sensitivity, whereas RxSig outperformed the LLM in PPV. RxSig also had a shorter run time and incurred no cost, compared with a cost of 25 dollars for the LLM.
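To illustrate what a regular expression-based approach involves in general, the sketch below parses a dose quantity and a frequency phrase from a sig and multiplies them. It is not RxSig's rule set: the patterns, the `FREQ_PER_DAY` table, and the function name `daily_units` are all hypothetical, and the result is in dosage units per day (converting to milligrams would additionally require the product strength).

```python
import re

# Hypothetical frequency vocabulary; real sigs use far more variants.
FREQ_PER_DAY = {
    "once daily": 1, "daily": 1, "every day": 1, "qd": 1,
    "twice daily": 2, "twice a day": 2, "bid": 2,
    "three times daily": 3, "three times a day": 3, "tid": 3,
}

# Longest phrases first so "twice daily" is preferred over "daily".
FREQ_RE = re.compile(
    r"\b("
    + "|".join(sorted(map(re.escape, FREQ_PER_DAY), key=len, reverse=True))
    + r")\b"
)

# Quantity per administration, e.g. "take 0.5 tablets".
DOSE_RE = re.compile(r"\btake\s+(\d+(?:\.\d+)?)\s+(?:tablet|capsule)s?\b")


def daily_units(sig):
    """Return dosage units per day, or None if the sig cannot be parsed."""
    text = " ".join(sig.lower().split())  # normalize case and whitespace
    dose = DOSE_RE.search(text)
    freq = FREQ_RE.search(text)
    if dose is None or freq is None:
        return None  # abstain rather than guess
    return float(dose.group(1)) * FREQ_PER_DAY[freq.group(1)]


example = daily_units("Take 1 tablet by mouth twice daily")
```

Returning `None` for unparsed sigs is a deliberate choice in this sketch: the extractor makes no assertion instead of a possibly wrong one.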

DISCUSSION

In practice, an algorithm should score high on PPV as well as F1-score, so that false positive assertions of daily dosage are kept to a minimum. In addition, long run times and high costs do not scale to larger datasets. RxSig is therefore likely the most scalable approach. Further research is needed to assess the generalizability of these findings.

CONCLUSION

This study demonstrates that both the LLM and RxSig models excel in daily dose extraction from free-text sigs, with the RxSig model appearing to be the more scalable approach.

