Integrating Health Care Data in an Informatics for Integrating Biology & the Bedside (i2b2) Model Persisted Through Elasticsearch: Design, Implementation, and Evaluation in a French University Hospital.

Authors

Griffier Romain, Mougin Fleur, Jouhet Vianney

Affiliations

Service d'Information Médicale, Informatique et Archivistique Médicale (IAM), Pôle de Santé Publique, Bordeaux University Hospital, Bordeaux, France.

Team AHeaD, Inserm Bordeaux Population Health Research Center, UMR 1219, Bordeaux University, Bordeaux, France.

Publication information

JMIR Med Inform. 2025 Apr 24;13:e65753. doi: 10.2196/65753.

DOI: 10.2196/65753
PMID: 40273445
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12062766/

Abstract

BACKGROUND

The volume of digital data in health care is continually growing. In addition to its use in health care, the health data collected can also serve secondary purposes, such as research. In this context, clinical data warehouses (CDWs) provide the infrastructure and organization necessary to enhance the secondary use of health data. Various data models have been proposed for structuring data in a CDW, including the Informatics for Integrating Biology & the Bedside (i2b2) model, which relies on a relational database. However, this persistence approach can lead to performance issues when executing queries on massive data sets.

OBJECTIVE

This study aims to describe the necessary transformations and their implementation to enable i2b2's search engine to perform the phenotyping task using data persistence in a NoSQL Elasticsearch database.

METHODS

This study compares data persistence in a standard relational database with a NoSQL Elasticsearch database in terms of query response and execution performance (focusing on counting queries based on structured data, numerical data, and free text, including temporal filtering) as well as material resource requirements. Additionally, the data loading and updating processes are described.
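
To make the kind of counting query described above concrete, here is a minimal sketch of an Elasticsearch request body that counts distinct patients matching a structured code, a numerical threshold, a free-text term, and a date range. It is an illustration, not the authors' implementation: the function name is hypothetical, and the field names (`concept_cd`, `nval_num`, `start_date`, `observation_blob`, `patient_num`) are borrowed from i2b2's `observation_fact` table on the assumption that the index reuses them, which the abstract does not state.

```python
from typing import Any, Dict, List, Optional


def build_patient_count_query(
    concept_cd: Optional[str] = None,
    min_value: Optional[float] = None,
    text: Optional[str] = None,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a query-DSL body that counts distinct patients (hypothetical helper)."""
    filters: List[Dict[str, Any]] = []  # exact, non-scoring criteria
    must: List[Dict[str, Any]] = []     # analyzed full-text criteria

    if concept_cd is not None:
        # Structured data: exact match on a coded concept.
        filters.append({"term": {"concept_cd": concept_cd}})
    if min_value is not None:
        # Numerical data: lower bound on the recorded value.
        filters.append({"range": {"nval_num": {"gte": min_value}}})
    if date_from is not None or date_to is not None:
        # Temporal filtering on the observation start date.
        bounds: Dict[str, str] = {}
        if date_from is not None:
            bounds["gte"] = date_from
        if date_to is not None:
            bounds["lte"] = date_to
        filters.append({"range": {"start_date": bounds}})
    if text is not None:
        # Free text: analyzed match on the document body.
        must.append({"match": {"observation_blob": text}})

    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"bool": {"filter": filters, "must": must}},
        # The cardinality aggregation returns an approximate distinct count.
        "aggs": {"patient_count": {"cardinality": {"field": "patient_num"}}},
    }
```

The resulting dictionary could be sent as the body of a `_search` request. Because the `cardinality` aggregation is approximate, a system that needs exact patient counts would have to handle that separately.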

RESULTS

We propose adaptations to the i2b2 model to accommodate the specific features of Elasticsearch, particularly its inability to perform joins between different indexes. The implementation was tested and evaluated within the CDW of Bordeaux University Hospital, which contains data on 2.5 million patients and over 3 billion observations. Overall, Elasticsearch achieves shorter query execution times compared with a relational database, with particularly significant performance gains for free-text searches. Additionally, compared with an indexed relational database (including a full-text index), Elasticsearch requires less disk space for storage.
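
The abstract does not detail which adaptations were made. One common way to work around the lack of cross-index joins is to denormalize: copy the patient attributes needed for filtering into every observation document at load time. The sketch below illustrates that general idea only, under the assumption that patients and observations arrive as plain dictionaries. The function name and the selection of copied fields are hypothetical, and the column names follow i2b2's `patient_dimension` and `observation_fact` tables.

```python
from typing import Any, Dict, Iterable, List

# Patient attributes copied into each observation document (assumed selection).
PATIENT_FIELDS = ("sex_cd", "birth_date")


def denormalize_observations(
    patients: Dict[int, Dict[str, Any]],
    observations: Iterable[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Embed patient attributes in each observation so no join is needed at query time."""
    docs: List[Dict[str, Any]] = []
    for obs in observations:
        patient = patients[obs["patient_num"]]
        doc = dict(obs)  # shallow copy so the input rows are left unchanged
        doc["patient"] = {field: patient.get(field) for field in PATIENT_FIELDS}
        docs.append(doc)
    return docs


if __name__ == "__main__":
    patients = {1: {"sex_cd": "F", "birth_date": "1980-05-01"}}
    observations = [
        {"patient_num": 1, "concept_cd": "ICD10:I10", "start_date": "2021-03-02"}
    ]
    for doc in denormalize_observations(patients, observations):
        print(doc)
```

The trade-off is that patient attributes are duplicated across documents, so a change to a patient record means re-indexing that patient's observations.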

CONCLUSIONS

We demonstrate that implementing i2b2 with Elasticsearch is feasible and significantly improves query performance while reducing disk space usage. This implementation is currently in production at Bordeaux University Hospital.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f454/12062766/9e0efeaf7713/medinform_v13i1e65753_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f454/12062766/51d41cc041a5/medinform_v13i1e65753_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f454/12062766/0ce4794d7122/medinform_v13i1e65753_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f454/12062766/7dc11cd1f56d/medinform_v13i1e65753_fig4.jpg
