A Framework for Classification of Electronic Health Data Extraction-Transformation-Loading Challenges in Data Network Participation.

Authors

Ong Toan, Pradhananga Rosina, Holve Erin, Kahn Michael G

Affiliations

Department of Pediatrics, University of Colorado Anschutz Medical Campus.

AcademyHealth.

Publication information

EGEMS (Wash DC). 2017 Jun 13;5(1):10. doi: 10.5334/egems.222.

DOI:10.5334/egems.222
PMID:29930958
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5994935/
Abstract

BACKGROUND

Contributing health data to national, regional, and local networks or registries requires data stored in local systems with local structures and codes to be extracted, transformed, and loaded into a standard format called a Common Data Model (CDM). These processes, called Extract, Transform, Load (ETL), require data partners or contributors to invest in costly technical resources with specialized skills in data models, terminologies, and programming. Given the wide range of tasks, skills, and technologies required to transform data into a CDM, a classification of ETL challenges can help identify needed resources, which in turn may encourage data partners with fewer technical capabilities to participate in data-sharing networks.
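
As a rough illustration of the extract, transform, and load steps described above, the following minimal Python sketch moves two patient rows from a local layout into a simplified standard record. Everything in it is an assumption made for illustration and does not come from the paper: the local field names (`pt_id`, `gender_cd`, `dob`), the code map, and the `CdmPerson` structure are hypothetical and do not reproduce any actual CDM specification.

```python
from dataclasses import dataclass

# Hypothetical local-to-standard code map (an assumption for illustration);
# a real network would rely on a curated terminology mapping.
LOCAL_TO_STANDARD_SEX = {"M": "male", "F": "female"}


@dataclass
class CdmPerson:
    """Hypothetical, simplified CDM row; not an actual CDM specification."""
    person_id: int
    sex: str
    birth_year: int


def extract(source_rows):
    """Extract: read rows from the local system (here, an in-memory list)."""
    return list(source_rows)


def transform(row):
    """Transform: rename local fields and map local codes to standard values."""
    return CdmPerson(
        person_id=int(row["pt_id"]),
        sex=LOCAL_TO_STANDARD_SEX.get(row["gender_cd"], "unknown"),
        birth_year=int(row["dob"][:4]),
    )


def load(records, target):
    """Load: append the standardized records to the target table."""
    target.extend(records)
    return len(records)


# Two rows in a made-up local format; the second uses an unmapped local code.
local_rows = [
    {"pt_id": "101", "gender_cd": "F", "dob": "1984-03-02"},
    {"pt_id": "102", "gender_cd": "X", "dob": "1990-11-17"},
]
cdm_table = []
loaded = load([transform(r) for r in extract(local_rows)], cdm_table)
```

Even in this toy case, the transform step has to decide what to do with a local code that has no standard equivalent. Here it falls back to "unknown"; a real pipeline might instead reject or flag such rows.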

METHODS

We conducted key-informant interviews with data partner representatives to survey the ETL challenges faced in clinical data research networks (CDRNs) and registries. A list of ETL challenges, organized into six themes, was vetted during a one-day workshop with a wide range of network stakeholders, including data partners, researchers, and policy experts.

RESULTS

We identified 24 technical ETL challenges related to the data sharing process. Workshop participants rated all of these challenges as "important" or "very important" on a five-point Likert scale. Based on these findings, we developed a framework for categorizing ETL challenges according to ETL phases, themes, and levels of data network participation.

CONCLUSIONS

Overcoming ETL technical challenges requires significant investments in a broad array of information technologies and human resources. Identifying these technical obstacles can inform optimal resource allocation to minimize the barriers and cost of entry for new data partners into extant networks, which in turn can expand data networks' inclusiveness and diversity. This paper offers pertinent information and a guiding framework to help data partners ascertain the challenges associated with contributing data to data networks.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b57/5994935/5924c1f1cae3/egems-5-1-222-g2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b57/5994935/2127e0be5748/egems-5-1-222-g1.jpg

Similar articles

1
A Framework for Classification of Electronic Health Data Extraction-Transformation-Loading Challenges in Data Network Participation.
EGEMS (Wash DC). 2017 Jun 13;5(1):10. doi: 10.5334/egems.222.
2
Dynamic-ETL: a hybrid approach for health data extraction, transformation and loading.
BMC Med Inform Decis Mak. 2017 Sep 13;17(1):134. doi: 10.1186/s12911-017-0532-3.
3
An ETL-process design for data harmonization to participate in international research with German real-world data based on FHIR and OMOP CDM.
Int J Med Inform. 2023 Jan;169:104925. doi: 10.1016/j.ijmedinf.2022.104925. Epub 2022 Nov 10.
4
An Extract-Transform-Load Process Design for the Incremental Loading of German Real-World Data Based on FHIR and OMOP CDM: Algorithm Development and Validation.
JMIR Med Inform. 2023 Aug 21;11:e47310. doi: 10.2196/47310.
5
Data interchange using i2b2.
J Am Med Inform Assoc. 2016 Sep;23(5):909-15. doi: 10.1093/jamia/ocv188. Epub 2016 Feb 5.
6
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
7
A Semantic Transformation Methodology for the Secondary Use of Observational Healthcare Data in Postmarketing Safety Studies.
Front Pharmacol. 2018 Apr 30;9:435. doi: 10.3389/fphar.2018.00435. eCollection 2018.
8
Towards ETL Processes to OMOP CDM Using Metadata and Modularization.
Stud Health Technol Inform. 2023 May 18;302:751-752. doi: 10.3233/SHTI230256.
9
Extract, transform, load framework for the conversion of health databases to OMOP.
PLoS One. 2022 Apr 11;17(4):e0266911. doi: 10.1371/journal.pone.0266911. eCollection 2022.
10
ADEpedia-on-OHDSI: A next generation pharmacovigilance signal detection platform using the OHDSI common data model.
J Biomed Inform. 2019 Mar;91:103119. doi: 10.1016/j.jbi.2019.103119. Epub 2019 Feb 7.

Cited by

1
Converting Health Level 7 Clinical Document Architecture (CDA) documents to Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) by leveraging CDA Template definitions.
JAMIA Open. 2025 Mar 26;8(2):ooaf022. doi: 10.1093/jamiaopen/ooaf022. eCollection 2025 Apr.
2
The necessity of validity diagnostics when drawing causal inferences from observational data: lessons from a multi-database evaluation of the risk of non-infectious uveitis among patients exposed to Remicade.
BMC Med Res Methodol. 2024 Dec 27;24(1):322. doi: 10.1186/s12874-024-02428-7.
3
A Common Longitudinal Intensive Care Unit data Format (CLIF) to enable multi-institutional federated critical illness research.
medRxiv. 2024 Sep 4:2024.09.04.24313058. doi: 10.1101/2024.09.04.24313058.
4
MENDS-on-FHIR: leveraging the OMOP common data model and FHIR standards for national chronic disease surveillance.
JAMIA Open. 2024 May 29;7(2):ooae045. doi: 10.1093/jamiaopen/ooae045. eCollection 2024 Jul.
5
"In conferences, everyone goes 'health data is the future'": an interview study on challenges in re-using EHR data for research in Clinical Data Warehouses.
AMIA Annu Symp Proc. 2024 Jan 11;2023:579-588. eCollection 2023.
6
MENDS-on-FHIR: Leveraging the OMOP common data model and FHIR standards for national chronic disease surveillance.
medRxiv. 2023 Nov 22:2023.08.09.23293900. doi: 10.1101/2023.08.09.23293900.
7
European Health Data & Evidence Network-learnings from building out a standardized international health data network.
J Am Med Inform Assoc. 2023 Dec 22;31(1):209-219. doi: 10.1093/jamia/ocad214.
8
Factors Affecting the Quality of Person-Generated Wearable Device Data and Associated Challenges: Rapid Systematic Review.
JMIR Mhealth Uhealth. 2021 Mar 19;9(3):e20738. doi: 10.2196/20738.
9
Review of Clinical Research Informatics.
Yearb Med Inform. 2020 Aug;29(1):193-202. doi: 10.1055/s-0040-1701988. Epub 2020 Aug 21.
10
Developing a Regional Distributed Data Network for Surveillance of Chronic Health Conditions: The Colorado Health Observation Regional Data Service.
J Public Health Manag Pract. 2019 Sep/Oct;25(5):498-507. doi: 10.1097/PHH.0000000000000810.