A comparison of synthetic data generation and federated analysis for enabling international evaluations of cardiovascular health.

Author information

Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, 5252 De Maisonneuve Blvd, Office 2B.39, Montréal, QC, H4A 3S5, Canada.

Department of Internal Medicine III, Division of Endocrinology and Metabolism, Gender Medicine Unit, Medical University of Vienna, Vienna, Austria.

Publication information

Sci Rep. 2023 Jul 17;13(1):11540. doi: 10.1038/s41598-023-38457-3.

Abstract

Sharing health data for research across international jurisdictions has been a challenge because of privacy concerns. Two privacy-enhancing technologies that can enable such sharing are synthetic data generation (SDG) and federated analysis, but their relative strengths and weaknesses have not yet been evaluated. In this study we compared SDG with federated analysis as ways of enabling international comparative studies. The objective of the analysis was to assess country-level differences in the role of sex in cardiovascular health (CVH) using a pooled dataset of Canadian and Austrian individuals. The Canadian data were synthesized and sent to the Austrian team for analysis. The utility of the pooled (synthetic Canadian + real Austrian) dataset was evaluated by comparing the regression results from the two approaches. The privacy of the Canadian synthetic data was assessed using a membership disclosure test, which showed an F1 score of 0.001, indicating low privacy risk. The outcome variable of interest was CVH, calculated through a modified CANHEART index. The main and interaction effect parameter estimates of the federated and pooled analyses were consistent and directionally the same. It took approximately one month to set up the synthetic data generation platform and generate the synthetic data, whereas it took over 1.5 years to set up the federated analysis system. Synthetic data generation can be an efficient and effective tool for enabling multi-jurisdictional studies while addressing privacy concerns.
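To make the membership disclosure F1 score more concrete, the following is a minimal Python sketch of how such a score could be computed. It is not the authors' implementation. The attack rule (an adversary claims a target record was in the training data if some synthetic record lies within a Hamming distance threshold), the toy records, and all function names are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

Record = Tuple[int, ...]  # one individual, encoded as categorical codes


def hamming(a: Record, b: Record) -> int:
    """Number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def claims_member(target: Record, synthetic: Sequence[Record], threshold: int) -> bool:
    """Hypothetical attack rule: claim membership if any synthetic record is close enough."""
    return any(hamming(target, s) <= threshold for s in synthetic)


def membership_f1(
    members: Sequence[Record],
    non_members: Sequence[Record],
    synthetic: Sequence[Record],
    threshold: int,
) -> float:
    """F1 score of the adversary's membership claims.

    members were used to train the generator; non_members were not.
    A low F1 means the synthetic data reveals little about who was in the training set.
    """
    tp = sum(claims_member(r, synthetic, threshold) for r in members)
    fp = sum(claims_member(r, synthetic, threshold) for r in non_members)
    fn = len(members) - tp
    if tp == 0:
        return 0.0  # no correct claims, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up for illustration, not from the study).
members = [(1, 0, 2), (0, 1, 1), (1, 1, 0), (0, 0, 2)]
non_members = [(1, 0, 0), (0, 1, 2), (1, 1, 1), (0, 0, 0)]
synthetic = [(1, 0, 2), (0, 0, 1)]

f1_exact = membership_f1(members, non_members, synthetic, threshold=0)
f1_near = membership_f1(members, non_members, synthetic, threshold=1)
print(f"F1 with exact matching: {f1_exact:.3f}")
print(f"F1 with distance <= 1:  {f1_near:.3f}")
```

In this sketch, a score near zero, such as the 0.001 reported in the abstract, would mean the adversary's membership claims are almost never both correct and complete. The toy values here are far higher because the synthetic records were chosen to sit close to the training records.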

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f48/10352377/9251ccbdf37f/41598_2023_38457_Fig1_HTML.jpg
