Haaker Theodorus S, Choi Joshua S, Nanjo Claude J, Warner Phillip B, Abu-Hanna Ameen, Kawamoto Kensaku
Department of Biomedical Informatics, University of Utah, Salt Lake City, UT 84108, United States.
Department of Medical Informatics, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands.
JAMIA Open. 2025 Jan 3;8(1):ooae153. doi: 10.1093/jamiaopen/ooae153. eCollection 2025 Feb.
To compare various methods for extracting daily dosage information from prescription signatures (sigs) and identify the best performers.
Five daily dosage extraction methods were identified and selected: Parsigs, RxSig, Sig2db, a large language model (LLM), and a bidirectional long short-term memory (BiLSTM) model. The methods were compared on positive predictive value (PPV), sensitivity, F1-score, computational cost, and run time, using a dataset of sigs from the context of heart failure with reduced ejection fraction.
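The three accuracy metrics named above follow from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the function name and the example counts are illustrative assumptions and are not taken from the study.

```python
def extraction_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (PPV, sensitivity, F1) from raw counts.

    tp: sigs where the extracted daily dose was correct
    fp: sigs where a daily dose was asserted but was wrong
    fn: sigs where a daily dose existed but was not extracted
    """
    # PPV (precision): share of asserted doses that were correct.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    # Sensitivity (recall): share of extractable doses that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of PPV and sensitivity.
    denom = ppv + sensitivity
    f1 = 2 * ppv * sensitivity / denom if denom else 0.0
    return ppv, sensitivity, f1


# Hypothetical counts, for illustration only.
ppv, sensitivity, f1 = extraction_metrics(tp=90, fp=10, fn=30)
print(f"PPV={ppv:.2f} sensitivity={sensitivity:.2f} F1={f1:.2f}")
```

Because F1 is a harmonic mean, a method cannot reach a high F1-score by excelling on only one of PPV or sensitivity.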
The dataset consisted of 29 896 free-text sigs, split 70%/30% into training and validation sets. The BiLSTM model scored lowest, with an F1-score of 0.71. The LLM (GPT-4o) and the regular expression-based RxSig achieved the highest F1-scores, 0.98 and 0.95, respectively. The LLM outperformed RxSig in sensitivity, whereas RxSig outperformed the LLM in PPV. Additionally, RxSig had a shorter run time and incurred no cost, whereas the LLM cost 25 dollars to run.
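To illustrate what a regular expression-based approach to daily dose extraction can look like, here is a toy sketch. It is not RxSig's actual rule set: the patterns, the frequency table, and the function name `daily_units` are all assumptions made for illustration, and real sigs are far more varied.

```python
import re

# Hypothetical frequency phrases mapped to administrations per day.
FREQ_PER_DAY = {
    "once daily": 1, "daily": 1, "every day": 1, "qd": 1,
    "twice daily": 2, "twice a day": 2, "bid": 2,
    "three times daily": 3, "three times a day": 3, "tid": 3,
}

# Quantity per administration, e.g. "take 1 tablet" or "take 0.5 tablets".
QTY = re.compile(r"\btake\s+(\d+(?:\.\d+)?)\s+(?:tablets?|capsules?)\b", re.I)

# List longer phrases first so "twice daily" is tried before "daily".
_phrases = sorted(FREQ_PER_DAY, key=len, reverse=True)
FREQ = re.compile(r"\b(" + "|".join(map(re.escape, _phrases)) + r")\b", re.I)


def daily_units(sig: str):
    """Return units per day, or None when the sig cannot be parsed."""
    qty = QTY.search(sig)
    freq = FREQ.search(sig)
    if qty is None or freq is None:
        return None  # abstain rather than assert a possibly wrong dose
    return float(qty.group(1)) * FREQ_PER_DAY[freq.group(1).lower()]


print(daily_units("Take 1 tablet by mouth twice daily"))
print(daily_units("Use as directed"))
```

Returning `None` for unrecognized sigs trades sensitivity for PPV: the extractor makes fewer assertions, but fewer of them are wrong.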
In practical use, an algorithm should score high on PPV and F1-score in order to reduce false positive assertions of daily dosage. Additionally, long run times and high costs do not scale to larger datasets. Thus, RxSig is likely the most scalable approach. Further research is needed to investigate the generalizability of the findings.
This study demonstrates that both the LLM and RxSig models excel in daily dose extraction from free-text sigs, with the RxSig model appearing to be the more scalable approach.