Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA.
Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
J Am Med Inform Assoc. 2021 Mar 18;28(4):782-790. doi: 10.1093/jamia/ocaa291.
To develop an algorithm for building longitudinal medication dose datasets using information extracted from clinical notes in electronic health records (EHRs).
We developed an algorithm that converts medication information extracted using natural language processing (NLP) into a usable format and builds longitudinal medication dose datasets. We evaluated the algorithm on 2 medications extracted from clinical notes of Vanderbilt's EHR and externally validated the algorithm using clinical notes from the MIMIC-III clinical care database.
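To make the conversion step concrete, here is a minimal sketch of turning extracted drug mentions into a longitudinal dose table. It is not the EHR package's implementation. The `Mention` fields, the function names, and the definitions used (dose intake as strength times dose amount, daily dose as dose intake times daily frequency) are assumptions made for illustration, since the abstract does not spell them out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mention:
    """One drug mention pulled from a note (all field names are hypothetical)."""
    patient_id: str
    note_date: str                  # ISO date, e.g. "2020-01-15"
    strength_mg: Optional[float]    # strength per unit, e.g. 5 mg per tablet
    dose_amount: Optional[float]    # units per intake, e.g. 2 tablets
    freq_per_day: Optional[float]   # intakes per day, e.g. 2 for "twice daily"


def dose_intake(m: Mention) -> Optional[float]:
    """Assumed definition: amount per intake = strength * dose amount."""
    if m.strength_mg is None or m.dose_amount is None:
        return None
    return m.strength_mg * m.dose_amount


def daily_dose(m: Mention) -> Optional[float]:
    """Assumed definition: daily dose = dose intake * intakes per day."""
    intake = dose_intake(m)
    if intake is None or m.freq_per_day is None:
        return None
    return intake * m.freq_per_day


def build_longitudinal(mentions):
    """Order mentions by patient and note date, one row per mention."""
    ordered = sorted(mentions, key=lambda m: (m.patient_id, m.note_date))
    return [
        {
            "patient_id": m.patient_id,
            "note_date": m.note_date,
            "dose_intake": dose_intake(m),
            "daily_dose": daily_dose(m),
        }
        for m in ordered
    ]


# Made-up mentions, not data from the study.
mentions = [
    Mention("p1", "2020-03-01", 5.0, 2.0, 2.0),
    Mention("p1", "2020-01-15", 5.0, 1.0, 2.0),
    Mention("p2", "2020-02-10", 100.0, 1.0, None),  # frequency not extracted
]
rows = build_longitudinal(mentions)
for row in rows:
    print(row)
```

Missing entities are propagated as `None` rather than guessed, so a mention with no extracted frequency still yields a dose intake but no daily dose.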
For the evaluation using Vanderbilt's EHR data, the performance of our algorithm was excellent; F1-measures were ≥0.98 for both dose intake and daily dose. For the external validation using MIMIC-III, the algorithm achieved F1-measures ≥0.85 for dose intake and ≥0.82 for daily dose.
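For readers unfamiliar with the metric, the F1-measure is the harmonic mean of precision and recall. The sketch below computes it from counts of true positives, false positives, and false negatives against a gold standard. The function name and the counts in the example are illustrative assumptions, not figures from the study.

```python
def f1_measure(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # no correct predictions at all
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 98 correct values, 2 spurious, 2 missed.
score = f1_measure(tp=98, fp=2, fn=2)
print(round(score, 2))
```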
Our algorithm addresses the challenge of building longitudinal medication dose data using information extracted from clinical notes. Overall performance was excellent, but the algorithm can perform poorly when NLP systems extract incorrect information. It performed reasonably well on the external data source, though worse than on Vanderbilt's data, owing to differences in how the drug information was written. The algorithm is implemented in the R package "EHR", and the extracted data from Vanderbilt's EHRs, along with the gold standards, are provided so that users can reproduce the results and help improve the algorithm.
Our algorithm for building longitudinal dose data provides a straightforward way to use EHR data for medication-based studies. The external validation results suggest that it may be applicable to other systems.