Data harmonization and federated analysis of population-based studies: the BioSHaRE project.

Authors

Doiron Dany, Burton Paul, Marcon Yannick, Gaye Amadou, Wolffenbuttel Bruce H R, Perola Markus, Stolk Ronald P, Foco Luisa, Minelli Cosetta, Waldenberger Melanie, Holle Rolf, Kvaløy Kirsti, Hillege Hans L, Tassé Anne-Marie, Ferretti Vincent, Fortier Isabel

Affiliations

Research Institute of the McGill University Health Centre, 2155 Guy, office 458, Montreal, Quebec H3H 2R9, Canada.

Public Population Project in Genomics and Society, Montreal, Canada.

Publication information

Emerg Themes Epidemiol. 2013 Nov 21;10(1):12. doi: 10.1186/1742-7622-10-12.

DOI: 10.1186/1742-7622-10-12
PMID: 24257327
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4175511/
Abstract

BACKGROUND

Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses.

METHODS

Eight population-based studies in six European countries were recruited to participate in the BioSHaRE project. Through workshops, teleconferences and electronic communications, participating investigators identified a set of 96 variables targeted for harmonization to answer research questions of interest. Using each study's questionnaires, standard operating procedures, and data dictionaries, harmonization potential was assessed. Whenever harmonization was deemed possible, processing algorithms were developed and implemented in an open-source software infrastructure to transform study-specific data into the target (i.e. harmonized) format. Harmonized datasets located on servers at each research centre across Europe were interconnected through a federated database system to perform statistical analysis.
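To illustrate what a processing algorithm of this kind might look like, the following is a minimal Python sketch that maps two differently coded study variables onto one harmonized target variable. Everything in it is hypothetical: the study names, the source fields `smoke_status` and `cigs_per_day`, and the target variable `current_smoker` are assumptions made for illustration and are not taken from the BioSHaRE project or its software.

```python
from typing import Callable, Optional

# Hypothetical target variable: "current_smoker" (True / False / None = unknown).
# The source codings below are invented for illustration only.


def harmonize_study_a(record: dict) -> Optional[bool]:
    """Hypothetical study A codes smoking as 0 = never, 1 = former, 2 = current."""
    code = record.get("smoke_status")
    if code is None:
        return None  # missing in the source stays missing in the target
    return code == 2


def harmonize_study_b(record: dict) -> Optional[bool]:
    """Hypothetical study B records the number of cigarettes smoked per day."""
    cigs = record.get("cigs_per_day")
    if cigs is None:
        return None
    return cigs > 0


# One processing algorithm per study for this target variable.
ALGORITHMS: dict[str, Callable[[dict], Optional[bool]]] = {
    "study_a": harmonize_study_a,
    "study_b": harmonize_study_b,
}


def harmonize(study: str, records: list[dict]) -> list[dict]:
    """Transform study-specific records into the common target format."""
    algorithm = ALGORITHMS[study]
    return [{"id": r["id"], "current_smoker": algorithm(r)} for r in records]


if __name__ == "__main__":
    print(harmonize("study_a", [{"id": 1, "smoke_status": 2}, {"id": 2}]))
    print(harmonize("study_b", [{"id": 7, "cigs_per_day": 0}]))
```

Keeping one small function per study and target variable mirrors the idea of assessing each study-variable pair separately: a pair judged impossible to harmonize would simply have no entry in the table.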

RESULTS

Retrospective harmonization led to the generation of common format variables for 73% of matches considered (96 targeted variables across 8 studies). Authenticated investigators can now perform complex statistical analyses of harmonized datasets stored on distributed servers without actually sharing individual-level data using the DataSHIELD method.
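The general idea of analysing distributed data without sharing individual-level records can be sketched in Python as follows. This is not the DataSHIELD implementation or its API (DataSHIELD is a separate software project); it is a simplified illustration in which each site releases only aggregate summaries and the analyst combines them. The names `SiteSummary`, `summarize_locally` and `pooled_mean_and_variance`, and the minimum cell size of 5, are assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical disclosure-control threshold: a site refuses to release a
# summary built from fewer than this many records.
MIN_CELL_SIZE = 5


@dataclass(frozen=True)
class SiteSummary:
    """Non-identifying aggregates that leave a site; raw values never do."""
    n: int
    total: float
    total_sq: float


def summarize_locally(values: list[float]) -> SiteSummary:
    """Runs at the data-holding site on its own harmonized variable."""
    if len(values) < MIN_CELL_SIZE:
        raise ValueError("too few records to release a summary")
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_and_variance(summaries: list[SiteSummary]) -> tuple[float, float]:
    """Runs at the analyst's side, using only the per-site aggregates."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from pooled sums: (sum(x^2) - n * mean^2) / (n - 1)
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    site_a = summarize_locally([1.0, 2.0, 3.0, 4.0, 5.0])
    site_b = summarize_locally([6.0, 7.0, 8.0, 9.0, 10.0])
    print(pooled_mean_and_variance([site_a, site_b]))
```

The pooled result is identical to what would be obtained from the combined individual-level data, because the mean and variance depend on the data only through the count, sum and sum of squares. More complex models need iterative exchanges of summaries, which this sketch does not cover.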

CONCLUSION

New Internet-based networking technologies and database management systems are providing the means to support collaborative, multi-center research in an efficient and secure manner. The results from this pilot project show that, given a strong collaborative relationship between participating studies, it is possible to seamlessly co-analyse internationally harmonized research databases while allowing each study to retain full control over individual-level data. We encourage additional collaborative research networks in epidemiology, public health, and the social sciences to make use of the open source tools presented herein.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/80c7/4175511/ac90551b5e69/1742-7622-10-12-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/80c7/4175511/10ee8de64ab7/1742-7622-10-12-1.jpg

Similar articles

1
Data harmonization and federated analysis of population-based studies: the BioSHaRE project.
Emerg Themes Epidemiol. 2013 Nov 21;10(1):12. doi: 10.1186/1742-7622-10-12.
2
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
3
A review of harmonization methods for studying dietary patterns.
Smart Health (Amst). 2022 Mar;23. doi: 10.1016/j.smhl.2021.100263. Epub 2022 Jan 13.
4
The federated database--a basis for biobank-based post-genome studies, integrating phenome and genome data from 600,000 twin pairs in Europe.
Eur J Hum Genet. 2007 Jul;15(7):718-23. doi: 10.1038/sj.ejhg.5201850. Epub 2007 May 9.
5
DataSHIELD: taking the analysis to the data, not the data to the analysis.
Int J Epidemiol. 2014 Dec;43(6):1929-44. doi: 10.1093/ije/dyu188. Epub 2014 Sep 26.
6
European Project on OSteoArthritis (EPOSA): methodological challenges in harmonization of existing data from five European population-based cohorts on aging.
BMC Musculoskelet Disord. 2011 Nov 28;12:272. doi: 10.1186/1471-2474-12-272.
7
The LifeCycle Project-EU Child Cohort Network: a federated analysis infrastructure and harmonized data of more than 250,000 children and parents.
Eur J Epidemiol. 2020 Jul;35(7):709-724. doi: 10.1007/s10654-020-00662-z. Epub 2020 Jul 23.
8
Access Governance for Biobanks: The Case of the BioSHaRE-EU Cohorts.
Biopreserv Biobank. 2016 Jun;14(3):201-6. doi: 10.1089/bio.2015.0124. Epub 2016 May 16.
9
Privacy-Preserving Workflow for the Cross-Border Federated Analysis of Clinical Data.
Stud Health Technol Inform. 2024 Aug 22;316:1637-1641. doi: 10.3233/SHTI240737.
10
Data harmonization and data pooling from cohort studies: a practical approach for data management.
Int J Popul Data Sci. 2021 Nov 30;6(1):1680. doi: 10.23889/ijpds.v6i1.1680. eCollection 2021.

Cited by

1
Integrative Harmonization of Phenotypic and Genomic Data Improves Bone Mineral Density Prediction in Multi-Study Osteoporosis Research.
medRxiv. 2025 May 13:2025.05.12.25327471. doi: 10.1101/2025.05.12.25327471.
2
ItemComplex: A Python-based visualization framework for ex-post organization and integration of large language-based datasets.
Eur Psychiatry. 2025 May 26;68(1):e75. doi: 10.1192/j.eurpsy.2025.2457.
3
Data harmonization for the analysis of personalized treatment of psychosis with metacognitive training.
Sci Rep. 2025 Mar 24;15(1):10159. doi: 10.1038/s41598-025-94815-3.
4
Prospective harmonisation of four international randomised controlled trials in Canada, China, India and South Africa: the Healthy Life Trajectories Initiative.
BMJ Open. 2025 Mar 3;15(3):e086233. doi: 10.1136/bmjopen-2024-086233.
5
A Latent Trait-based Measure as a Data Harmonization and Missing Data Solution Applied to the Environmental Influences on Child Health Outcomes Cohort.
Epidemiology. 2025 May 1;36(3):413-424. doi: 10.1097/EDE.0000000000001832. Epub 2025 Apr 1.
6
MOLGENIS Armadillo: a lightweight server for federated analysis using DataSHIELD.
Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae726.
7
Cohort profile: Worldwide Collaboration on OsteoArthritis prediCtion for the Hip (World COACH) - an international consortium of prospective cohort studies with individual participant data on hip osteoarthritis.
BMJ Open. 2024 Apr 18;14(4):e077907. doi: 10.1136/bmjopen-2023-077907.
8
Navigating data standards in public health: A brief report from a data-standards meeting.
J Glob Health. 2024 Apr 5;14:03024. doi: 10.7189/jogh.14.03024.
9
Multi-omics subgroups associated with glycaemic deterioration in type 2 diabetes: an IMI-RHAPSODY Study.
Front Endocrinol (Lausanne). 2024 Mar 6;15:1350796. doi: 10.3389/fendo.2024.1350796. eCollection 2024.
10
INSPIRE datahub: a pan-African integrated suite of services for harmonising longitudinal population health data using OHDSI tools.
Front Digit Health. 2024 Jan 29;6:1329630. doi: 10.3389/fdgth.2024.1329630. eCollection 2024.

References

1
Data sharing in large research consortia: experiences and recommendations from ENGAGE.
Eur J Hum Genet. 2014 Mar;22(3):317-21. doi: 10.1038/ejhg.2013.131. Epub 2013 Jun 19.
2
Transforming epidemiology for 21st century medicine and public health.
Cancer Epidemiol Biomarkers Prev. 2013 Apr;22(4):508-16. doi: 10.1158/1055-9965.EPI-13-0146. Epub 2013 Mar 5.
3
Pooling birth cohorts in allergy and asthma: European Union-funded initiatives - a MeDALL, CHICOS, ENRIECO, and GA²LEN joint paper.
Int Arch Allergy Immunol. 2013;161(1):1-10. doi: 10.1159/000343018. Epub 2012 Dec 13.
4
'Metabolically healthy obesity': origins and implications.
Mol Aspects Med. 2013 Feb;34(1):59-70. doi: 10.1016/j.mam.2012.10.004. Epub 2012 Oct 13.
5
A secure distributed logistic regression protocol for the detection of rare adverse drug events.
J Am Med Inform Assoc. 2013 May 1;20(3):453-61. doi: 10.1136/amiajnl-2011-000735. Epub 2012 Aug 7.
6
Toward a roadmap in global biobanking for health.
Eur J Hum Genet. 2012 Nov;20(11):1105-11. doi: 10.1038/ejhg.2012.96. Epub 2012 Jun 20.
7
Toward interoperable bioscience data.
Nat Genet. 2012 Jan 27;44(2):121-6. doi: 10.1038/ng.1054.
8
Is rigorous retrospective harmonization possible? Application of the DataSHaPER approach across 53 large studies.
Int J Epidemiol. 2011 Oct;40(5):1314-28. doi: 10.1093/ije/dyr106. Epub 2011 Jul 30.
9
Towards a data sharing Code of Conduct for international genomic research.
Genome Med. 2011 Jul 14;3(7):46. doi: 10.1186/gm262.
10
From single biobanks to international networks: developing e-governance.
Hum Genet. 2011 Sep;130(3):377-82. doi: 10.1007/s00439-011-1063-0. Epub 2011 Jul 23.