Data harmonization and federated analysis of population-based studies: the BioSHaRE project.

Authors

Doiron Dany, Burton Paul, Marcon Yannick, Gaye Amadou, Wolffenbuttel Bruce H R, Perola Markus, Stolk Ronald P, Foco Luisa, Minelli Cosetta, Waldenberger Melanie, Holle Rolf, Kvaløy Kirsti, Hillege Hans L, Tassé Anne-Marie, Ferretti Vincent, Fortier Isabel

Affiliations

Research Institute of the McGill University Health Centre, 2155 Guy, office 458, Montreal, Quebec H3H 2R9, Canada.

Public Population Project in Genomics and Society, Montreal, Canada.

Publication information

Emerg Themes Epidemiol. 2013 Nov 21;10(1):12. doi: 10.1186/1742-7622-10-12.

Abstract

BACKGROUND

Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses.

METHODS

Eight population-based studies in six European countries were recruited to participate in the BioSHaRE project. Through workshops, teleconferences and electronic communications, participating investigators identified a set of 96 variables targeted for harmonization to answer research questions of interest. Harmonization potential was assessed using each study's questionnaires, standard operating procedures, and data dictionaries. Whenever harmonization was deemed possible, processing algorithms were developed and implemented in an open-source software infrastructure to transform study-specific data into the target (i.e. harmonized) format. Harmonized datasets located on servers at each research centre across Europe were interconnected through a federated database system to perform statistical analyses.
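As an illustration of what a processing algorithm of this kind might look like, the following minimal Python sketch maps study-specific codes onto a single target variable. The study names, the coding schemes and the target variable (current smoker) are hypothetical and are not taken from the BioSHaRE studies or from its software infrastructure.

```python
from typing import Hashable, List, Optional

# Hypothetical target variable: current smoker (1 = yes, 0 = no, None = missing).
# Hypothetical study-specific coding schemes, invented for this illustration.
STUDY_RULES = {
    "study_a": {"never": 0, "former": 0, "current": 1},
    "study_b": {1: 1, 2: 0, 3: 0},  # 1 = smokes daily, 2 = quit, 3 = never smoked
}


def harmonize_current_smoker(study: str, raw_value: Hashable) -> Optional[int]:
    """Map one study-specific value to the target format.

    Returns None when the study or the value has no mapping rule, so that
    unmappable records surface as missing instead of being guessed.
    """
    return STUDY_RULES.get(study, {}).get(raw_value)


def harmonize_dataset(study: str, raw_values: List[Hashable]) -> List[Optional[int]]:
    """Apply the study's mapping rule to every record of one study."""
    return [harmonize_current_smoker(study, value) for value in raw_values]
```

Keeping the rules in a per-study table separates the harmonization decision (can this study's variable be mapped, and how) from the code that applies it.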

RESULTS

Retrospective harmonization led to the generation of common format variables for 73% of matches considered (96 targeted variables across 8 studies). Authenticated investigators can now perform complex statistical analyses of harmonized datasets stored on distributed servers without actually sharing individual-level data using the DataSHIELD method.
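The general idea of analysing distributed data without moving individual-level records can be sketched as follows. This is a simplified Python illustration and not the DataSHIELD API: the function names, the `Summary` type and the `MIN_COUNT` threshold are assumptions made for the example. Each site releases only a count and a sum, and the analysis client combines these aggregates into a pooled mean.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_COUNT = 5  # hypothetical disclosure threshold, chosen for illustration only


@dataclass
class Summary:
    n: int        # number of non-missing observations at the site
    total: float  # sum of those observations


def local_summary(values: List[Optional[float]]) -> Optional[Summary]:
    """Runs at a study site and returns aggregates only, never records."""
    observed = [v for v in values if v is not None]
    if len(observed) < MIN_COUNT:
        return None  # refuse to release summaries for very small groups
    return Summary(n=len(observed), total=sum(observed))


def pooled_mean(summaries: List[Optional[Summary]]) -> float:
    """Runs at the analysis client and combines the site-level aggregates."""
    usable = [s for s in summaries if s is not None]
    n = sum(s.n for s in usable)
    if n == 0:
        raise ValueError("no site released a summary")
    return sum(s.total for s in usable) / n
```

Because the pooled mean depends only on the per-site counts and sums, it equals the mean that would be obtained from the physically pooled data of the sites that released a summary.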

CONCLUSION

New Internet-based networking technologies and database management systems are providing the means to support collaborative, multi-centre research in an efficient and secure manner. The results from this pilot project show that, given a strong collaborative relationship between participating studies, it is possible to seamlessly co-analyse internationally harmonized research databases while allowing each study to retain full control over individual-level data. We encourage additional collaborative research networks in epidemiology, public health, and the social sciences to make use of the open-source tools presented herein.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/80c7/4175511/10ee8de64ab7/1742-7622-10-12-1.jpg
