Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland.
Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.
Biol Direct. 2018 Feb 6;13(1):1. doi: 10.1186/s13062-017-0203-4.
Users of a personalised recommendation system face a dilemma: recommendations can be improved by learning from data, but only if other users are willing to share their private information. Good personalised predictions are vitally important in precision medicine, but the genomic information on which the predictions are based is also particularly sensitive, as it directly identifies the patients and hence cannot easily be anonymised. Differential privacy has emerged as a promising solution: privacy is considered sufficient if the presence of individual patients cannot be distinguished. However, differentially private learning with current methods does not improve predictions with feasible data sizes and dimensionalities.
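The guarantee referred to informally above can be stated precisely. In the standard formulation (not spelled out in this abstract), a randomised mechanism $M$ is $\epsilon$-differentially private if its output distribution changes by at most a factor $e^{\epsilon}$ when any one individual's record is changed:

```latex
% \epsilon-differential privacy: for all neighbouring data sets D, D'
% differing in the record of a single individual, and for every
% measurable set of outputs S,
\Pr[M(D) \in S] \;\le\; e^{\epsilon} \, \Pr[M(D') \in S].
```

Smaller $\epsilon$ means stronger privacy: an observer of the output cannot reliably distinguish whether any particular patient's data were included.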
We show that useful predictors can be learned under strong differential privacy guarantees, even from moderately sized data sets, by demonstrating significant improvements in the accuracy of private drug sensitivity prediction with a new robust private regression method. Our method matches the predictive accuracy of state-of-the-art non-private lasso regression using only 4x more samples, under relatively strong differential privacy guarantees. Good performance with limited data is achieved by limiting the sharing of private information: reducing the dimensionality and projecting outliers to tighter bounds lowers the sensitivity of the computation, so less noise needs to be added for the same level of privacy.
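The mechanism behind this trade-off can be illustrated with a generic sufficient-statistic perturbation sketch for private linear regression. This is not the authors' exact method; the clipping bounds, sensitivity estimates, and the use of Laplace noise here are illustrative assumptions. It shows why projecting outliers to tighter bounds helps: the noise scale is proportional to the sensitivity, which shrinks with the bounds.

```python
import numpy as np

def private_linear_regression(X, y, epsilon, bound_x=1.0, bound_y=1.0,
                              ridge=1.0, rng=None):
    """Sketch of differentially private regression via noisy sufficient
    statistics. Hypothetical parameters: `bound_x`/`bound_y` are the
    clipping bounds; tighter bounds mean lower sensitivity and less noise
    for the same epsilon.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    # Project outliers into fixed bounds (the "tighter bounds" idea).
    X = np.clip(X, -bound_x, bound_x)
    y = np.clip(y, -bound_y, bound_y)
    # Loose upper bounds on the L1 sensitivity of X^T X and X^T y when
    # one record is replaced; they grow with d and the clipping bounds.
    sens_xx = 2 * d * d * bound_x ** 2
    sens_xy = 2 * d * bound_x * bound_y
    # Split the privacy budget between the two statistics and add
    # Laplace noise calibrated to sensitivity / (epsilon / 2).
    noisy_xx = X.T @ X + rng.laplace(0.0, 2 * sens_xx / epsilon, (d, d))
    noisy_xx = (noisy_xx + noisy_xx.T) / 2  # restore symmetry
    noisy_xy = X.T @ y + rng.laplace(0.0, 2 * sens_xy / epsilon, d)
    # A ridge term keeps the perturbed normal equations well-posed.
    return np.linalg.solve(noisy_xx + ridge * np.eye(d), noisy_xy)
```

Reducing the dimensionality `d` before this step shrinks both sensitivities quadratically and linearly respectively, which is one concrete reason low-dimensional private learning needs far less noise.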
The proposed differentially private regression method combines theoretical appeal and asymptotic efficiency with good prediction accuracy even on moderately sized data. As even this simple-to-implement method already shows promise on challenging genomic data, we anticipate rapid progress towards practical applications in many fields.
This article was reviewed by Zoltan Gaspari and David Kreil.