Molnár Viktor, Cs Sági Judit, Molnár Mária Judit
Semmelweis Egyetem, Általános Orvostudományi Kar, Genomikai Medicina és Ritka Betegségek Intézete, Budapest, Üllői út 26., 1085 Magyarország.
Orv Hetil. 2023 May 28;164(21):811-819. doi: 10.1556/650.2023.32759.
Fragmentation of health data and biomedical research data is a major obstacle to precision medicine based on data-driven decisions. The development of personalized medicine requires the efficient exploitation of health data resources that are extraordinary in size and complexity but highly fragmented, as well as technologies that enable data sharing across institutions and even across borders. Biobanks are both sample archives and data integration centers. Analyzing large biobank data warehouses as federated datasets promises conclusions with higher statistical power. A prerequisite for data sharing is harmonization, i.e., the mapping of the unique clinical and molecular characteristics of samples into a unified data model and standard codes. Databases aligned to a common schema in this way then make healthcare information available for privacy-preserving federated data sharing and learning. The re-evaluation of sensitive health data is inconceivable without the protection of privacy, the legal and conceptual framework for which is set out in the GDPR (General Data Protection Regulation) and the FAIR (findable, accessible, interoperable, reusable) principles. For biobanks in Europe, common guidelines are developed by the BBMRI-ERIC (Biobanking and Biomolecular Research Infrastructure - European Research Infrastructure Consortium) research infrastructure, which the Hungarian BBMRI Node joined in 2021. As a first step, a federation of biobanks can connect fragmented datasets, providing high-quality datasets that serve multiple research goals. Extending the approach to real-world data could also allow a higher-level evaluation of data generated in routine patient care, and thus complement the evidence generated within the rigorous framework of clinical trials. In this publication, we present the potential of federated data sharing in the context of the Semmelweis University Biobanks joint project.
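The two ideas described above, harmonization to standard codes and federated analysis, can be illustrated with a minimal sketch. It is not the project's implementation: the site names, local diagnosis labels, mapping tables, and function names are all hypothetical, and ICD-10 is assumed as the shared vocabulary purely for illustration. Each site maps its own labels to the common code and shares only aggregate counts, which a coordinator then combines.

```python
from collections import Counter

# Hypothetical local-to-standard code maps, one per site.
# ICD-10 is assumed here as the shared vocabulary (E11: type 2 diabetes, G20: Parkinson's disease).
SITE_A_MAP = {"T2DM": "E11", "PARK": "G20"}
SITE_B_MAP = {"diabetes_2": "E11", "parkinson": "G20"}


def harmonize(records, code_map):
    """Map each local diagnosis label to the shared code; skip unmapped records."""
    return [code_map[r["dx"]] for r in records if r["dx"] in code_map]


def local_counts(records, code_map):
    """Runs inside one site: only aggregate counts leave the institution."""
    return Counter(harmonize(records, code_map))


def federated_counts(per_site_counts):
    """Coordinator: combine per-site aggregates without seeing row-level data."""
    total = Counter()
    for counts in per_site_counts:
        total.update(counts)
    return total


# Toy row-level data that stays at each (hypothetical) site.
site_a = [{"dx": "T2DM"}, {"dx": "PARK"}, {"dx": "T2DM"}]
site_b = [{"dx": "diabetes_2"}, {"dx": "unknown"}]

combined = federated_counts(
    [local_counts(site_a, SITE_A_MAP), local_counts(site_b, SITE_B_MAP)]
)
print(dict(combined))
```

Sharing aggregates alone is not a privacy guarantee; the sketch only shows where harmonization and aggregation sit relative to the institutional boundary.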