Towards quality improvement of vaccine concept mappings in the OMOP vocabulary with a semi-automated method.

Affiliations

Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Odysseus Data Services, Cambridge, MA, USA.

Publication information

J Biomed Inform. 2022 Oct;134:104162. doi: 10.1016/j.jbi.2022.104162. Epub 2022 Aug 25.

DOI: 10.1016/j.jbi.2022.104162

PMID: 36029954

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9940475/

Abstract

The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) provides a unified model to integrate disparate real-world data (RWD) sources. An integral part of the OMOP CDM is the Standardized Vocabularies (henceforth referred to as the OMOP vocabulary), which enables organization and standardization of medical concepts across various clinical domains of the OMOP CDM. For concepts with the same meaning from different source vocabularies, one is designated as the standard concept, while the others are specified as non-standard or source concepts and mapped to the standard one. However, due to the heterogeneity of source vocabularies, the OMOP vocabulary may contain mapping issues such as erroneous mappings and missing mappings, which could affect the results of downstream analyses with RWD. In this paper, we focus on quality assurance of vaccine concept mappings in the OMOP vocabulary, which is necessary to accurately harness the power of RWD on vaccines. We introduce a semi-automated lexical approach to audit vaccine mappings in the OMOP vocabulary. We generated two types of vaccine-pairs: mapped and unmapped, where mapped vaccine-pairs are pairs of vaccine concepts with a "Maps to" relationship, while unmapped vaccine-pairs are those without a "Maps to" relationship. We represented each vaccine concept name as a set of words, and derived term-difference pairs (i.e., name differences) for mapped and unmapped vaccine-pairs. If the same term-difference pair is obtained from both mapped and unmapped vaccine-pairs, it is considered a potential mapping inconsistency. Applying this approach to the vaccine mappings in OMOP yielded a total of 2087 potential mapping inconsistencies. A random sample of 200 was evaluated by domain experts to identify, validate, and categorize the inconsistencies. Experts identified 95 cases revealing valid mapping issues. The remaining 105 cases were found to be invalid because the mappings relied on external and/or contextual information that was not reflected in the vaccine concept names. This indicates that our semi-automated approach shows promise in identifying mapping inconsistencies among vaccine concepts in the OMOP vocabulary.
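
The lexical audit described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept IDs and names are hypothetical toy data rather than real OMOP concepts, the function names are invented for illustration, and the sketch assumes whitespace tokenization, unordered term-difference pairs, and a comparison of all concept pairs, none of which the abstract specifies.

```python
from collections import defaultdict
from itertools import combinations


def words(name):
    """Represent a concept name as a set of lowercase words."""
    return frozenset(name.lower().split())


def term_difference(name_a, name_b):
    """Return the unordered pair of word sets unique to each name (assumption)."""
    a, b = words(name_a), words(name_b)
    return frozenset({a - b, b - a})


def find_inconsistencies(concepts, maps_to):
    """Keep term-difference pairs produced by both mapped and unmapped pairs.

    concepts: dict of concept id -> concept name (hypothetical toy data)
    maps_to:  set of frozenset({id_a, id_b}) linked by a "Maps to" relationship
    """
    by_diff = defaultdict(lambda: {"mapped": [], "unmapped": []})
    for id_a, id_b in combinations(sorted(concepts), 2):
        diff = term_difference(concepts[id_a], concepts[id_b])
        kind = "mapped" if frozenset({id_a, id_b}) in maps_to else "unmapped"
        by_diff[diff][kind].append((id_a, id_b))
    # A difference seen in both groups is a potential mapping inconsistency.
    return {d: g for d, g in by_diff.items() if g["mapped"] and g["unmapped"]}


# Hypothetical names: only the influenza pair carries a "Maps to" link.
concepts = {
    1: "influenza vaccine",
    2: "influenza virus vaccine",
    3: "rabies vaccine",
    4: "rabies virus vaccine",
}
maps_to = {frozenset({1, 2})}

flagged = find_inconsistencies(concepts, maps_to)
for diff, groups in flagged.items():
    print(sorted(map(sorted, diff)), groups)
```

In this toy data the name difference "virus" separates the mapped influenza pair and the unmapped rabies pair, so the rabies pair is flagged as a candidate missing mapping. A flag is only a candidate: in the study, experts confirmed 95 of the 200 sampled cases (47.5%) as valid mapping issues.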

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f27a/9940475/507bb4737a53/nihms-1870816-f0001.jpg

Similar articles

Assessing the Use of German Claims Data Vocabularies for Research in the Observational Medical Outcomes Partnership Common Data Model: Development and Evaluation Study.

JMIR Med Inform. 2023 Nov 7;11:e47959. doi: 10.2196/47959.

Content Coverage Evaluation of the OMOP Vocabulary on the Transplant Domain Focusing on Concepts Relevant for Kidney Transplant Outcomes Analysis.

Appl Clin Inform. 2020 Aug;11(4):650-658. doi: 10.1055/s-0040-1716528. Epub 2020 Oct 7.

Advancing Toward a Common Data Model in Ophthalmology: Gap Analysis of General Eye Examination Concepts to Standard Observational Medical Outcomes Partnership (OMOP) Concepts.

Ophthalmol Sci. 2023 Aug 25;3(4):100391. doi: 10.1016/j.xops.2023.100391. eCollection 2023 Dec.

Methodological Issues in Using a Common Data Model of COVID-19 Vaccine Uptake and Important Adverse Events of Interest: Feasibility Study of Data and Connectivity COVID-19 Vaccines Pharmacovigilance in the United Kingdom.

JMIR Form Res. 2022 Aug 22;6(8):e37821. doi: 10.2196/37821.

Incorporation of Korean Electronic Data Interchange Vocabulary into Observational Medical Outcomes Partnership Vocabulary.

Healthc Inform Res. 2021 Jan;27(1):29-38. doi: 10.4258/hir.2021.27.1.29. Epub 2021 Jan 31.

Transforming Anesthesia Data Into the Observational Medical Outcomes Partnership Common Data Model: Development and Usability Study.

J Med Internet Res. 2021 Oct 29;23(10):e29259. doi: 10.2196/29259.

Transforming Primary Care Data Into the Observational Medical Outcomes Partnership Common Data Model: Development and Usability Study.

JMIR Med Inform. 2024 Aug 13;12:e49542. doi: 10.2196/49542.

Eos and OMOCL: Towards a seamless integration of openEHR records into the OMOP Common Data Model.

J Biomed Inform. 2023 Aug;144:104437. doi: 10.1016/j.jbi.2023.104437. Epub 2023 Jul 12.

Standardizing registry data to the OMOP Common Data Model: experience from three pulmonary hypertension databases.

BMC Med Res Methodol. 2021 Nov 2;21(1):238. doi: 10.1186/s12874-021-01434-3.

Cited by

Mapping and Harmonization of CVX vaccine terms to the Vaccine Ontology.

bioRxiv. 2025 Jul 18:2025.07.15.664501. doi: 10.1101/2025.07.15.664501.

Mapping vaccine names in clinical trials to vaccine ontology using cascaded fine-tuned domain-specific language models.

J Biomed Semantics. 2024 Aug 10;15(1):14. doi: 10.1186/s13326-024-00318-x.

Mapping Vaccine Names in Clinical Trials to Vaccine Ontology using Cascaded Fine-Tuned Domain-Specific Language Models.

Res Sq. 2023 Sep 27:rs.3.rs-3362256. doi: 10.21203/rs.3.rs-3362256/v1.

Identifying Missing IS-A Relations in Orphanet Rare Disease Ontology.

Proceedings (IEEE Int Conf Bioinformatics Biomed). 2022 Dec;2022:3274-3279. doi: 10.1109/bibm55620.2022.9995614. Epub 2023 Jan 2.
