A data-driven approach to choosing privacy parameters for clinical trial data sharing under differential privacy.

Affiliations

Study Design and Data Analysis, College of Public Health, University of South Florida, Tampa, FL 33612, United States.

Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556, United States.

Publication information

J Am Med Inform Assoc. 2024 Apr 19;31(5):1135-1143. doi: 10.1093/jamia/ocae038.

Abstract

OBJECTIVES

Clinical trial data sharing is crucial for promoting transparency and collaborative efforts in medical research. Differential privacy (DP) is a formal statistical technique for anonymizing shared data that balances the privacy of individual records against the accuracy of replicated results through a "privacy budget" parameter, ε. DP is considered the state of the art in privacy-protected data publication, yet it is underutilized in clinical trial data sharing. This study aims to identify ε values for sharing clinical trial data.
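
For reference, the role of ε can be read off the standard definition of ε-differential privacy. The statement below is the textbook form and is not quoted from the abstract; the symbols M (a randomized release mechanism), D and D' (two datasets differing in one individual's record), and S (any set of possible outputs) are notation introduced here for illustration.

```latex
\Pr\bigl[M(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[M(D') \in S\bigr]
\qquad \text{for all neighboring } D, D' \text{ and all output sets } S .
```

A smaller ε forces the two output distributions closer together, so a single participant's record has less influence on what is released.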

MATERIALS AND METHODS

We analyzed 2 clinical trial datasets with privacy budget ε ranging from 0.01 to 10. Smaller values of ε entail adding greater amounts of random noise and therefore give better privacy. We compared rates, odds ratios, means, and mean differences computed from the original clinical trial datasets with the empirical distribution of the corresponding DP estimators.
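
The relationship between ε and noise described above can be sketched with the Laplace mechanism applied to an event rate. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say which DP mechanism was used, and the counts (65 events among 1000 participants, chosen only to give a 6.5% rate), the number of draws, and the function names `laplace_noise`, `dp_rate`, and `summarize` are all hypothetical.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_rate(events, n, epsilon, rng):
    # Assumption: n is fixed and public, so replacing one participant's
    # record changes the rate by at most 1/n (the sensitivity).
    sensitivity = 1.0 / n
    scale = sensitivity / epsilon  # smaller epsilon -> larger noise scale
    noisy = events / n + laplace_noise(scale, rng)
    # Clamping to [0, 1] is post-processing and does not weaken the guarantee.
    return min(1.0, max(0.0, noisy))


def summarize(events, n, epsilon, draws=2000, seed=0):
    # Empirical distribution of the DP estimator over repeated releases.
    rng = random.Random(seed)
    samples = [dp_rate(events, n, epsilon, rng) for _ in range(draws)]
    return min(samples), statistics.median(samples), max(samples)


if __name__ == "__main__":
    # Hypothetical trial arm: 65 events among 1000 participants.
    for eps in (0.01, 0.1, 1, 3, 5, 10):
        lo, med, hi = summarize(65, 1000, eps)
        print(f"epsilon={eps:<5} min={lo:.4f} median={med:.4f} max={hi:.4f}")
```

Running the script prints the minimum, median, and maximum of the noisy rate for each ε, which is one way to inspect how the spread of a DP estimator narrows around the original value as ε grows.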

RESULTS

The DP rate closely approximated the original rate of 6.5% when ε > 1. The DP odds ratio closely aligned with the original odds ratio of 0.689 when ε ≥ 3. The DP mean closely approximated the original mean of 164.64 when ε ≥ 1. As ε increased to 5, both the minimum and maximum DP means converged toward the original mean.

DISCUSSION

There is no consensus on how to choose the privacy budget ε. The definition of DP does not specify the required level of privacy, and there is no established formula for determining ε.

CONCLUSION

Our findings suggest that the application of DP holds promise in the context of sharing clinical trial data.
