Privacy-preserving logistic regression training.

Affiliation

imec-Cosic, Dept. Electrical Engineering, KU Leuven, Kasteelpark Arenberg 10, Leuven, Belgium.

Publication information

BMC Med Genomics. 2018 Oct 11;11(Suppl 4):86. doi: 10.1186/s12920-018-0398-y.

DOI:10.1186/s12920-018-0398-y

PMID:30309364

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6180357/

Abstract

BACKGROUND

Logistic regression is a popular machine learning technique for constructing classification models. Since constructing such models requires computing on large datasets, it is appealing to outsource this computation to a cloud service. The privacy-sensitive nature of the input data requires appropriate privacy-preserving measures before outsourcing it. Homomorphic encryption enables one to compute on encrypted data directly, without decryption, and can be used to mitigate the privacy concerns raised by using a cloud service.
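To make "computing on encrypted data directly" concrete, here is a minimal toy sketch. It uses the additively homomorphic Paillier scheme as a stand-in; the abstract does not name the scheme used in the paper, so this choice, the tiny primes, and all function names (`keygen`, `encrypt`, `decrypt`, `add`, `scalar_mul`) are assumptions made for illustration only. The parameters are far too small to be secure.

```python
import math

def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1 (illustrative, insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                               # inverse of lam mod n (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m, r):
    """Encrypt m (0 <= m < n) with randomness r, where gcd(r, n) = 1."""
    n2 = n * n
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, secret_key, c):
    lam, mu = secret_key
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return c1 * c2 % (n * n)

def scalar_mul(n, c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k modulo n."""
    return pow(c, k, n * n)
```

A party holding only the public modulus `n` can call `add` and `scalar_mul` on ciphertexts without ever seeing the plaintexts; only the holder of the secret key can decrypt the result. Training a logistic regression model also needs multiplications between encrypted values, which Paillier does not support, so a scheme that supports both additions and multiplications on ciphertexts would be required in practice.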

METHODS

In this paper, we propose an algorithm (and its implementation) to train a logistic regression model on a homomorphically encrypted dataset. The core of our algorithm consists of a new iterative method that can be seen as a simplified form of the fixed Hessian method, but with a much lower multiplicative complexity.
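The abstract does not spell out the iteration, so the following is only a plaintext sketch of one plausible reading of "a simplified form of the fixed Hessian method". The fixed Hessian method replaces the Hessian of the log-likelihood by the constant bound $-\tfrac{1}{4}X^{\top}X$. The sketch assumes a further simplification: that bound is replaced by a diagonal matrix built from the row sums of $X^{\top}X$, which avoids a matrix inversion. It also assumes non-negative features (for example, scaled to $[0, 1]$) with no all-zero column, labels in $\{0, 1\}$, and an exact sigmoid. The function names are hypothetical and this is not the authors' implementation.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_likelihood(X, y, beta):
    total = 0.0
    for row, label in zip(X, y):
        p = sigmoid(sum(b * x for b, x in zip(beta, row)))
        total += math.log(p) if label == 1 else math.log(1.0 - p)
    return total

def train_diag_fixed_hessian(X, y, iterations=50):
    """Assumed simplification: diagonal stand-in for the fixed Hessian bound.

    X: list of rows with non-negative entries; y: labels in {0, 1}.
    """
    n, d = len(X), len(X[0])
    # Row sums of X^T X: s_k = sum_i X[i][k] * (sum_j X[i][j]).
    row_totals = [sum(row) for row in X]
    s = [sum(X[i][k] * row_totals[i] for i in range(n)) for k in range(d)]
    # Per-coordinate step size, computed once before the loop.
    step = [4.0 / s_k for s_k in s]
    beta = [0.0] * d
    for _ in range(iterations):
        p = [sigmoid(sum(b * x for b, x in zip(beta, row))) for row in X]
        # Gradient of the log-likelihood: X^T (y - p).
        g = [sum((y[i] - p[i]) * X[i][k] for i in range(n)) for k in range(d)]
        beta = [b + st * gk for b, st, gk in zip(beta, step, g)]
    return beta
```

Because the step sizes depend only on the data and are fixed before the loop, each iteration needs just the gradient and one coordinate-wise scaling. On encrypted data the sigmoid would additionally have to be replaced by a low-degree polynomial, since homomorphic schemes evaluate additions and multiplications only; that substitution is left out of this sketch.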

RESULTS

We test the new method on two real-life applications. The first is in medicine and constructs a model that predicts the probability that a patient has cancer, given genomic data as input. The second is in finance, where the model predicts the probability that a credit card transaction is fraudulent. The method produces accurate results for both applications, comparable to running standard algorithms on plaintext data.

CONCLUSIONS

This article introduces a new, simple iterative algorithm for training a logistic regression model, tailored to homomorphically encrypted datasets. The algorithm can be used as a privacy-preserving technique to build a binary classification model and applies to a wide range of problems that can be modelled with logistic regression. Our implementation results show that our method can handle the large datasets used in logistic regression training.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c9b/6180357/600d595c836d/12920_2018_398_Fig1_HTML.jpg
