Privacy-Preserving Workflow for the Cross-Border Federated Analysis of Clinical Data.

Affiliations

HLRS, University of Stuttgart, Germany.

HPC Department, CINECA Consorzio Interuniversitario, Italy.

Publication information

Stud Health Technol Inform. 2024 Aug 22;316:1637-1641. doi: 10.3233/SHTI240737.

Abstract

This research is motivated by the need to analyze, in a privacy-preserving way, data held at remote sites in different jurisdictions, where individual-level information cannot be shared. We present key findings from a requirements analysis and the resulting federated data analysis workflow, built with open-source research software, in which patient-level information is stored securely and never exposed during the analysis. We also present improvements that further strengthen the security of the workflow, and we emphasize and showcase the use of data harmonization in the analysis. The analysis is carried out in the R language for statistical computing with the DataSHIELD libraries for non-disclosive analysis of sensitive data. The workflow was validated on two data analysis scenarios, and it confirmed the results obtained with a centralized analysis approach. The clinical datasets are part of the large pan-European SARS-CoV-2 cohort collected and managed by the ORCHESTRA project. We demonstrate the viability of establishing a cross-border federated data analysis framework and of conducting an analysis without exposing patient-level information, achieving results equivalent to those of a centralized, non-secure analysis. However, it is vital to meet the requirements for data harmonization, anonymization and IT infrastructure in order to maintain availability, usability and data security.
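The equivalence between federated and centralized results can be illustrated with a minimal sketch. It is written in Python for self-containment and is not the authors' R/DataSHIELD workflow. The names `SiteSummary`, `summarize`, `pooled_mean_variance` and the threshold `MIN_COUNT` are hypothetical, as are the sample values. Each site releases only aggregate statistics, and the coordinator combines them into the same mean and variance that a pooled analysis would give.

```python
import math
import statistics
from dataclasses import dataclass

# Hypothetical disclosure threshold: a site refuses to release aggregates
# computed from fewer records than this. The value is an assumption.
MIN_COUNT = 3


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate statistics a site may release; no individual values."""
    n: int
    total: float
    total_sq: float


def summarize(values, min_count=MIN_COUNT):
    """Runs at a data site: reduce patient-level values to aggregates."""
    if len(values) < min_count:
        raise ValueError("too few records to release an aggregate")
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_variance(summaries):
    """Runs at the coordinator: combine site aggregates only."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sums: (sum(x^2) - n * mean^2) / (n - 1)
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Made-up values standing in for one variable held at two sites.
site_a = [52.0, 61.0, 47.0, 70.0]
site_b = [58.0, 66.0, 49.0]

fed_mean, fed_var = pooled_mean_variance([summarize(site_a), summarize(site_b)])

# Centralized reference: only possible here because the data are synthetic.
central = site_a + site_b
print(math.isclose(fed_mean, statistics.mean(central)))
print(math.isclose(fed_var, statistics.variance(central)))
```

Only `SiteSummary` objects cross the site boundary in this sketch. A production setup would add authentication, transport security and further disclosure checks, which the sketch leaves out.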
