Jackalope Plus tool for post-coordination, ontology development, and precise mapping in observational health studies.

Authors

Trofymenko Maksym, Korchmar Eduard, Kaduk Denys, Vikhrak Marta, Khilchevskyi Bohdan, Nesmiian Tetiana, Talapova Polina, Ved Max, Ageeva Inna

Affiliations

IT company SciForce, Kharkiv, Ukraine.

Taras Shevchenko National University of Kyiv, Kyiv, Ukraine.

Publication information

Sci Rep. 2025 Jul 2;15(1):23674. doi: 10.1038/s41598-025-04046-9.

DOI: 10.1038/s41598-025-04046-9

PMID: 40603892

Abstract

Accurate mapping of complex health data to the OMOP CDM while preserving clinical nuance remains a challenge. We introduce Jackalope Plus, a novel tool leveraging SNOMED CT post-coordination and a GPT-4o mini LLM, to significantly enhance the precision and efficiency of this process. Our two-step approach combines semantic search with LLM-driven standardization, enabling accurate conversion of intricate medical concepts. Evaluation on benchmark and custom datasets demonstrates that Jackalope Plus identifies correct mappings for over 77.5% of complex terminologies, substantially outperforming Usagi (52.5%) and matching the accuracy of manual mapping while offering up to 50% time savings. Jackalope Plus offers a versatile solution for diverse healthcare data environments. Future work will focus on refining the tool through user feedback integration and addressing ambiguities in overlapping concepts. A free beta version is available for research and feedback. Ethical review confirms no storage of patient-identifiable information.
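The two-step approach named in the abstract (semantic search, then a standardization step) can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary, the placeholder IDs (`T1`, `Q1`, and so on), the bag-of-words similarity, and the function names `semantic_search` and `standardize` are all assumptions made for illustration. In particular, the second step here is a simple rule-based stub standing in for the LLM call, and the output only loosely imitates the shape of a post-coordinated expression.

```python
import math
from collections import Counter

# Hypothetical mini-vocabulary; the IDs are placeholders, not real SNOMED CT codes.
VOCAB = {
    "T1": "fracture of femur",
    "T2": "fracture of tibia",
    "T3": "pain in knee",
}
QUALIFIERS = {"left": "Q1", "right": "Q2"}  # placeholder laterality values


def vectorize(text):
    """Bag-of-words term counts; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_search(source_term, k=2):
    """Step 1: rank vocabulary concepts by similarity to the source term."""
    query = vectorize(source_term)
    scored = [(cosine(query, vectorize(name)), cid) for cid, name in VOCAB.items()]
    return [cid for score, cid in sorted(scored, reverse=True)[:k] if score > 0]


def standardize(source_term, candidates):
    """Step 2: rule-based stub used here in place of an LLM.

    Takes the top candidate as the focus concept and attaches any
    laterality qualifier found in the source term.
    """
    if not candidates:
        return None
    focus = candidates[0]
    expression = f"{focus} |{VOCAB[focus]}|"
    words = source_term.lower().split()
    for word, qid in QUALIFIERS.items():
        if word in words:
            expression += f" : laterality = {qid} |{word}|"
    return expression


term = "left femur fracture"
print(standardize(term, semantic_search(term)))
```

In this sketch the search step only narrows the candidate set, and the second step decides how to express detail (here, laterality) that no single candidate concept carries. That division of labor is the point of the illustration; a real system would replace both steps with far richer components.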

