A Data-Driven Iterative Approach for Semi-automatically Assessing the Correctness of Medication Value Sets: A Proof of Concept Based on Opioids.

Affiliations

Department of Computer Science, Emory University, Atlanta, Georgia, United States.

Language Technologies Institute, Carnegie Mellon University, Pennsylvania, United States.

Publication information

Methods Inf Med. 2021 Dec;60(S 02):e111-e119. doi: 10.1055/s-0041-1740358. Epub 2021 Dec 29.

DOI:10.1055/s-0041-1740358

PMID:34965602

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8716187/

Abstract

BACKGROUND

Value sets are lists of terms (e.g., opioid medication names) and their corresponding codes from standard clinical vocabularies (e.g., RxNorm), created to support health information exchange and research. Value sets are created manually and often contain errors.

OBJECTIVES

The aim of the study is to develop a semi-automatic, data-centric natural language processing (NLP) method to assess medication-related value set correctness and evaluate it on a set of opioid medication value sets.

METHODS

We developed an NLP algorithm that uses value sets containing mostly true positives and true negatives to learn lexical patterns associated with the true positives, and then applies these patterns to identify potential errors in unseen value sets. We evaluated the algorithm on a set of opioid medication value sets using recall, precision, and F-score. We then applied the trained model to unseen opioid value sets and assessed their correctness based on recall. To replicate how the algorithm would be applied in real-world settings, a domain expert manually conducted an error analysis to identify potential system errors and value set errors.
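The idea of learning lexical patterns from true positives and using them to flag suspect entries can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a deliberately naive pattern definition (any alphabetic token that occurs in positive terms and never in negative terms), and the function names and the toy medication terms are hypothetical.

```python
import re
from collections import Counter

def tokenize(term):
    """Lowercase a term and return its alphabetic tokens (numbers are dropped)."""
    return re.findall(r"[a-z]+", term.lower())

def learn_patterns(positive_terms, negative_terms, min_count=1):
    """Keep tokens that occur in positive terms and never in negative terms.

    Assumption: a 'lexical pattern' is a single token; the paper may use
    richer patterns.
    """
    positive_counts = Counter(
        token for term in positive_terms for token in set(tokenize(term))
    )
    negative_tokens = {token for term in negative_terms for token in tokenize(term)}
    return {
        token
        for token, count in positive_counts.items()
        if count >= min_count and token not in negative_tokens
    }

def flag_potential_errors(terms, patterns):
    """Return the terms that contain none of the learned tokens."""
    return [term for term in terms if not patterns & set(tokenize(term))]

# Toy training data (hypothetical, not taken from the study).
OPIOID_TERMS = [
    "oxycodone hydrochloride 5 MG oral tablet",
    "morphine sulfate 10 MG/ML injection",
]
NON_OPIOID_TERMS = [
    "metformin hydrochloride 500 MG oral tablet",
    "gentamicin sulfate 40 MG/ML injection",
]
# Toy unseen "opioid" value set containing a trade name and a non-opioid.
UNSEEN_TERMS = [
    "morphine sulfate 15 MG oral tablet",
    "Percocet 5 MG / 325 MG oral tablet",
    "naproxen 250 MG oral tablet",
]

PATTERNS = learn_patterns(OPIOID_TERMS, NON_OPIOID_TERMS)
FLAGGED = flag_potential_errors(UNSEEN_TERMS, PATTERNS)
print(sorted(PATTERNS))
print(FLAGGED)
```

In this toy run the naproxen entry is a genuine value set error, while the Percocet entry is flagged only because the trade name never appeared in training. Flagged terms are therefore candidates for expert review rather than confirmed errors.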

RESULTS

Thirty-eight value sets were retrieved from the Value Set Authority Center, and six (two opioid, four non-opioid) were used to develop and evaluate the system. Average precision, recall, and F-score were 0.932, 0.904, and 0.909, respectively, on uncorrected value sets, and 0.958, 0.953, and 0.953, respectively, after manual correction of the same value sets. On 20 unseen opioid value sets, the algorithm obtained an average recall of 0.89. Error analyses revealed that the main sources of system misclassifications were differences in how opioids were coded in the value sets: the training value sets contained mostly generic names, whereas some of the unseen value sets contained new trade names and ingredients.
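The reported average F-score of 0.909 is lower than the harmonic mean of the reported average precision and recall, which is about 0.918. One plausible reading, which the abstract does not state, is that each metric was computed per value set and then averaged. The Python sketch below shows how such per-set averaging can produce this gap. The per-set counts are hypothetical and serve only to illustrate the arithmetic.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F1 from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_score(precision, recall)

def macro_average(per_set_counts):
    """Average each metric over value sets (assumed averaging scheme)."""
    scores = [precision_recall_f(*counts) for counts in per_set_counts]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))

# F1 recomputed directly from the averages reported in the abstract.
F_FROM_REPORTED = f_score(0.932, 0.904)
print(round(F_FROM_REPORTED, 3))  # 0.918, versus the reported average of 0.909

# Hypothetical (tp, fp, fn) counts for two value sets.
TOY_COUNTS = [(90, 10, 5), (40, 2, 20)]
MACRO_P, MACRO_R, MACRO_F = macro_average(TOY_COUNTS)
F_OF_MACRO = f_score(MACRO_P, MACRO_R)
print(MACRO_F < F_OF_MACRO)  # True: averaging per-set F1 gives the lower value
```

Under this reading the two figures do not conflict, because the average of per-set F-scores cannot exceed the F-score computed from the averaged precision and recall.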

CONCLUSION

The proposed approach is data-centric, reusable, customizable, and not resource intensive. It may help domain experts to easily validate value sets.
