Capability and accuracy of usual statistical analyses in a real-world setting using a federated approach.

Affiliations

Keyrus Life Science, Nantes, France.

Roche Medical Data Center, Boulogne-Billancourt, France.

Publication information

PLoS One. 2024 Nov 14;19(11):e0312697. doi: 10.1371/journal.pone.0312697. eCollection 2024.

Abstract

METHODS

The objective of this project was to determine whether a federated analysis approach using DataSHIELD can match the results of a classical centralized analysis in a real-world setting. This research was carried out on an anonymous synthetic longitudinal real-world oncology cohort randomly split into three local databases, mimicking three healthcare organizations, stored in a federated data platform integrating DataSHIELD. No individual data were transferred: statistics were calculated in parallel within each healthcare organization, and only summary statistics (aggregates) were returned to the federated data analyst. Descriptive statistics, survival analysis, regression models and correlation were first performed with the centralized approach and then reproduced with the federated approach. The results of the two approaches were then compared.
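The aggregate-only workflow described above can be illustrated with a minimal Python sketch. It is not DataSHIELD code (DataSHIELD is an R-based system) and does not reproduce the authors' pipeline: the function names, the simulated variable, and the choice of sufficient statistics are assumptions made for illustration. The three site sizes mirror those reported in the Results. Each site returns only a count, a sum and a sum of squares, from which the analyst recovers the pooled mean and variance.

```python
import random
import statistics


def site_aggregates(values):
    """Runs inside one organization; only these summaries leave the site."""
    return {
        "n": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


def pooled_mean_variance(aggregates):
    """Runs on the analyst side; combines per-site summaries only."""
    n = sum(a["n"] for a in aggregates)
    total = sum(a["sum"] for a in aggregates)
    total_sq = sum(a["sum_sq"] for a in aggregates)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)  # sample variance
    return mean, variance


# Simulated cohort (hypothetical values), split into three sites of
# 157, 94 and 64 patients.
random.seed(0)
cohort = [random.gauss(60, 12) for _ in range(315)]
sites = [cohort[:157], cohort[157:251], cohort[251:]]

fed_mean, fed_var = pooled_mean_variance([site_aggregates(s) for s in sites])

print(fed_mean, statistics.mean(cohort))
print(fed_var, statistics.variance(cohort))
```

Because the mean and variance depend on the data only through these sums, the federated and centralized values agree up to floating-point rounding, which is consistent with the equivalence the abstract reports for most descriptive statistics.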

RESULTS

The cohort was split into three samples (N1 = 157 patients, N2 = 94 and N3 = 64); 11 derived variables and four types of analyses were generated. All analyses were successfully reproduced using DataSHIELD, except for one descriptive variable that could not be reproduced because of data disclosure limitations in the federated environment, showing the good capability of DataSHIELD. For descriptive statistics, the federated and centralized approaches gave exactly equivalent results, except for some differences in position measures. Estimates of univariate regression models were similar, while a loss of accuracy was observed for multivariate models due to source database variability.
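The abstract does not say why position measures differed. One plausible explanation, offered here as an assumption, is that quantiles cannot be reconstructed exactly from per-site summaries in the way sums can. The sketch below uses made-up values and a size-weighted average of site medians as an illustrative combining rule; it is not presented as the rule the authors used.

```python
import statistics


def weighted_site_mean(sites):
    """Size-weighted average of per-site means (equals the pooled mean)."""
    n = sum(len(s) for s in sites)
    return sum(len(s) * statistics.mean(s) for s in sites) / n


def weighted_site_median(sites):
    """Size-weighted average of per-site medians (an approximation only)."""
    n = sum(len(s) for s in sites)
    return sum(len(s) * statistics.median(s) for s in sites) / n


# Hypothetical toy data for three sites.
sites = [[1, 2, 3, 4, 100], [5, 6, 7], [8, 9]]
pooled = [v for s in sites for v in s]

central_mean = statistics.mean(pooled)        # 14.5
federated_mean = weighted_site_mean(sites)    # 14.5

central_median = statistics.median(pooled)      # 5.5
federated_median = weighted_site_median(sites)  # 5.0

print(central_mean, federated_mean)
print(central_median, federated_median)
```

The mean is recovered exactly, while the combined median differs from the pooled median, which matches the pattern described for the descriptive statistics.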

CONCLUSION

Our project showed a practical implementation and use case of a real-world federated approach using DataSHIELD. The capability and accuracy of common data manipulation and analysis were satisfactory, and the flexibility of the tool enabled the production of a variety of analyses while preserving the privacy of individual data. The DataSHIELD forum was also a practical source of information and support. To find the right balance between privacy and accuracy of the analysis, privacy requirements should be established before the analysis starts, and a data quality review of the participating healthcare organizations should be carried out.
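The trade-off between privacy and accuracy can be made concrete with a small-cell disclosure check of the kind that may explain why one descriptive variable could not be reproduced. The following Python sketch is hypothetical: the threshold value, the function name and the exception are assumptions for illustration and are not DataSHIELD's API or the settings used in the study.

```python
from collections import Counter

# Assumed minimum cell count, to be agreed before the analysis starts.
MIN_CELL_COUNT = 3


class DisclosureError(Exception):
    """Raised when a summary could reveal information about individuals."""


def safe_frequency_table(values, min_cell_count=MIN_CELL_COUNT):
    """Return category counts only if every cell meets the minimum count."""
    counts = Counter(values)
    small = [k for k, c in counts.items() if c < min_cell_count]
    if small:
        raise DisclosureError(
            f"{len(small)} cell(s) below the minimum count of {min_cell_count}"
        )
    return dict(counts)


# Hypothetical tumour-stage variable held at one site.
stage_ok = ["I"] * 10 + ["II"] * 7 + ["III"] * 5
stage_rare = stage_ok + ["IV"]  # a single patient in stage IV

print(safe_frequency_table(stage_ok))
try:
    safe_frequency_table(stage_rare)
except DisclosureError as err:
    print("blocked:", err)
```

A stricter threshold blocks more tables and protects rare categories better, while a looser one returns more results. Fixing the value up front makes it clear in advance which analyses a federated setup can deliver.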

Similar articles

1
Capability and accuracy of usual statistical analyses in a real-world setting using a federated approach.
PLoS One. 2024 Nov 14;19(11):e0312697. doi: 10.1371/journal.pone.0312697. eCollection 2024.
2
DataSHIELD: taking the analysis to the data, not the data to the analysis.
Int J Epidemiol. 2014 Dec;43(6):1929-44. doi: 10.1093/ije/dyu188. Epub 2014 Sep 26.
3
dsSurvival 2.0: privacy enhancing survival curves for survival models in the federated DataSHIELD analysis system.
BMC Res Notes. 2023 Jun 6;16(1):98. doi: 10.1186/s13104-023-06372-5.
4
dsSurvival: Privacy preserving survival models for federated individual patient meta-analysis in DataSHIELD.
BMC Res Notes. 2022 Jun 3;15(1):197. doi: 10.1186/s13104-022-06085-1.
5
Federated difference-in-differences with multiple time periods in DataSHIELD.
iScience. 2024 Oct 9;27(11):111025. doi: 10.1016/j.isci.2024.111025. eCollection 2024 Nov 15.
6
Deep generative models in DataSHIELD.
BMC Med Res Methodol. 2021 Apr 3;21(1):64. doi: 10.1186/s12874-021-01237-6.
7
Privacy-Preserving Workflow for the Cross-Border Federated Analysis of Clinical Data.
Stud Health Technol Inform. 2024 Aug 22;316:1637-1641. doi: 10.3233/SHTI240737.
8
dsSynthetic: synthetic data generation for the DataSHIELD federated analysis system.
BMC Res Notes. 2022 Jun 27;15(1):230. doi: 10.1186/s13104-022-06111-2.
9
Privacy-Preserving Federated Survival Support Vector Machines for Cross-Institutional Time-To-Event Analysis: Algorithm Development and Validation.
JMIR AI. 2024 Mar 29;3:e47652. doi: 10.2196/47652.
10
Privacy-preserving federated machine learning on FAIR health data: A real-world application.
Comput Struct Biotechnol J. 2024 Feb 17;24:136-145. doi: 10.1016/j.csbj.2024.02.014. eCollection 2024 Dec.
