Becker Benedikt F H, Avillach Paul, Romio Silvana, van Mulligen Erik M, Weibel Daniel, Sturkenboom Miriam C J M, Kors Jan A
Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands.
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
Pharmacoepidemiol Drug Saf. 2017 Aug;26(8):998-1005. doi: 10.1002/pds.4245. Epub 2017 Jun 28.
Assessing drug and vaccine effects by combining information from different healthcare databases in the European Union requires extensive effort to harmonize codes, because different countries use different coding vocabularies. In this paper, we present a web application called CodeMapper, which assists in mapping case definitions to codes from different vocabularies while keeping a transparent record of the complete mapping process.
CodeMapper builds upon the coding vocabularies contained in the Metathesaurus of the Unified Medical Language System. The mapping approach consists of three phases. First, medical concepts are automatically identified in a free-text case definition. Second, the user revises the set of medical concepts by adding or removing concepts, or by expanding them to related concepts that are more general or more specific. Finally, the selected concepts are projected to codes from the targeted coding vocabularies. We evaluated the application by automatically generating codes from case definitions, using CodeMapper's concept identification followed by successive concept expansion, and comparing them with reference codes that had been manually created in a previous epidemiological study.
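The three phases can be illustrated with a minimal sketch. It is not CodeMapper's implementation: the concept identifiers, terms, vocabulary names, and codes below are all invented, and plain substring matching stands in for whatever concept identification the application actually uses.

```python
# Toy stand-in for a metathesaurus. Every identifier, term, and code here
# is hypothetical and chosen only to illustrate the three phases.
TERMS = {
    "myocardial infarction": "X001",
    "heart attack": "X001",
    "chest pain": "X002",
}
NARROWER = {
    "X001": {"X003"},  # X003: a hypothetical, more specific concept
}
CODES = {
    "X001": {"VOCAB_A": {"A10"}, "VOCAB_B": {"B77"}},
    "X002": {"VOCAB_A": {"A55"}},
    "X003": {"VOCAB_A": {"A10.1"}, "VOCAB_B": set()},
}


def identify_concepts(text):
    """Phase 1: find concepts whose terms occur in the free text."""
    lowered = text.lower()
    return {cid for term, cid in TERMS.items() if term in lowered}


def expand_narrower(concepts):
    """Phase 2 (one expansion step): add more specific related concepts."""
    expanded = set(concepts)
    for cid in concepts:
        expanded |= NARROWER.get(cid, set())
    return expanded


def project(concepts, vocabularies):
    """Phase 3: collect the codes of the selected concepts per vocabulary."""
    result = {vocab: set() for vocab in vocabularies}
    for cid in concepts:
        for vocab in vocabularies:
            result[vocab] |= CODES.get(cid, {}).get(vocab, set())
    return result


found = identify_concepts("Patients with a heart attack or chest pain")
selected = found - {"X002"}           # manual revision: drop one concept
expanded = expand_narrower(selected)  # one step of concept expansion
codes = project(expanded, ["VOCAB_A", "VOCAB_B"])
```

In this sketch the user's revision is modeled as ordinary set operations on the identified concepts, which keeps each step easy to record and replay.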
Automated concept identification alone had a sensitivity of 0.246 and positive predictive value (PPV) of 0.420 for reproducing the reference codes. Three successive steps of concept expansion increased sensitivity to 0.953 and PPV to 0.616.
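As a rough illustration of these two measures, the sketch below treats the generated and reference codes as plain sets and counts their overlap. This is an assumption made for illustration: the abstract does not state how codes were matched or aggregated across vocabularies and case definitions, and the function name and example codes are hypothetical.

```python
def sensitivity_and_ppv(generated, reference):
    """Compare a generated code set against a reference code set.

    sensitivity = TP / (TP + FN): share of reference codes that were generated
    PPV         = TP / (TP + FP): share of generated codes that are in the reference
    """
    generated, reference = set(generated), set(reference)
    true_positives = len(generated & reference)
    sensitivity = true_positives / len(reference) if reference else float("nan")
    ppv = true_positives / len(generated) if generated else float("nan")
    return sensitivity, ppv


# Invented example: two of three reference codes are reproduced,
# and two of four generated codes are correct.
sens, ppv = sensitivity_and_ppv({"A", "B", "C", "D"}, {"B", "C", "E"})
```

Under this reading, expanding concepts adds generated codes, which can raise sensitivity by covering more reference codes and raises PPV only if most of the added codes are in the reference.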
Automatic concept identification in the case definition alone was insufficient to reproduce the reference codes, but CodeMapper's concept expansion operations provide an effective, efficient, and transparent way of doing so.