Private Continuous Survival Analysis with Distributed Multi-Site Data.

Authors

Bonomi Luca, Lionts Marilyn, Fan Liyue

Affiliations

Dept. Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN.

Dept. Computer Science, Vanderbilt University, Nashville, TN.

Publication information

Proc IEEE Int Conf Big Data. 2023 Dec;2023:5444-5453. doi: 10.1109/BigData59044.2023.10386571.

DOI:10.1109/BigData59044.2023.10386571

PMID:38585488

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10997374/

Abstract

Effective disease surveillance systems require large-scale epidemiological data to improve health outcomes and quality of care for the general population. As data may be limited within a single site, multi-site data (e.g., from a number of local/regional health systems) need to be considered. Leveraging distributed data across multiple sites for epidemiological analysis poses significant challenges. Due to the sensitive nature of epidemiological data, it is imperative to design distributed solutions that provide strong privacy protections. Current privacy solutions often assume a central site, which is responsible for aggregating the distributed data and applying privacy protection before sharing the results (e.g., aggregation via secure primitives and differential privacy for sharing aggregate results). However, identifying such a central site may be difficult in practice, and relying on a central site may introduce potential vulnerabilities (e.g., a single point of failure). Furthermore, to support clinical interventions and inform policy decisions in a timely manner, epidemiological analyses need to reflect dynamic changes in the data. Yet existing distributed privacy-protecting approaches were largely designed for static data (e.g., one-time data sharing) and cannot fulfill dynamic data requirements. In this work, we propose a privacy-protecting approach that supports the sharing of dynamic epidemiological analysis and provides strong privacy protection in a decentralized manner. We apply our solution to continuous survival analysis using the Kaplan-Meier estimation model while providing differential privacy protection. Our evaluations on a real dataset containing COVID-19 cases show that our method provides highly usable results.
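To make the combination of Kaplan-Meier estimation and differential privacy more concrete, the following is a minimal illustrative sketch and not the authors' protocol. It assumes each site perturbs its own per-interval event and censoring counts with Laplace noise before sharing them, so no trusted central site sees raw counts. The function names (`perturb`, `kaplan_meier`, `pooled_private_km`), the noise scale `2 / epsilon`, and the treatment of the initial cohort size `n_start` as public are all assumptions made for illustration; the sketch covers a single release and does not address the continuous (repeated-release) setting described in the abstract.

```python
import numpy as np

def perturb(counts, epsilon, rng):
    """Add Laplace noise to one site's counts.

    The scale 2/epsilon is an illustrative choice, not a verified
    privacy calibration for any particular neighboring definition.
    """
    counts = np.asarray(counts, dtype=float)
    return counts + rng.laplace(0.0, 2.0 / epsilon, size=counts.shape)

def kaplan_meier(events, censored, n_start):
    """Kaplan-Meier survival estimates from per-interval counts."""
    events = np.asarray(events, dtype=float)
    censored = np.asarray(censored, dtype=float)
    exits = events + censored
    # Number at risk at the start of each interval: initial cohort
    # minus everyone who left in earlier intervals.
    at_risk = n_start - np.concatenate(([0.0], np.cumsum(exits)[:-1]))
    # Noise can push counts out of range, so clamp before dividing.
    at_risk = np.maximum(at_risk, 1.0)
    hazard = np.clip(events / at_risk, 0.0, 1.0)
    return np.cumprod(1.0 - hazard)

def pooled_private_km(site_events, site_censored, n_start, epsilon, seed=0):
    """Sum locally perturbed counts across sites, then estimate survival."""
    rng = np.random.default_rng(seed)
    events = sum(perturb(e, epsilon, rng) for e in site_events)
    censored = sum(perturb(c, epsilon, rng) for c in site_censored)
    return kaplan_meier(events, censored, n_start)
```

Because every site adds noise locally, any party can sum the perturbed counts; the cost is that noise accumulates with the number of sites, which is one reason decentralized designs typically pair perturbation with secure aggregation.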



