College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China.
College of Computer and Control Engineering, Qiqihar University, Qiqihar 161006, China.
Comput Intell Neurosci. 2022 Mar 26;2022:2935975. doi: 10.1155/2022/2935975. eCollection 2022.
Support vector machine (SVM) is an efficient classification method in machine learning. However, when the training dataset contains sensitive information, a traditionally trained SVM classifier can pose a serious threat to personal privacy. Principal component analysis (PCA) projects instances into a low-dimensional subspace while capturing as much of the variance of the data matrix as possible. PCA is commonly computed by one of two algorithms: eigenvalue decomposition (EVD) or singular value decomposition (SVD). The main advantage of SVD over EVD is that it does not need to form the covariance matrix. This study presents a new differentially private SVD algorithm (DPSVD) to prevent privacy leakage from SVM classifiers. DPSVD generates a set of private singular vectors such that the instances projected into the singular subspace can be used directly to train an SVM without disclosing the privacy of the original instances. After proving that DPSVD satisfies differential privacy in theory, several experiments were carried out. The experimental results confirm that, compared with other existing private PCA algorithms used to train SVMs, our method achieves higher accuracy and better stability on different real datasets.
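The abstract only names the pipeline (SVD-based projection of instances, then SVM training on the projected data), not the exact noise mechanism of DPSVD. The sketch below is therefore illustrative rather than a reproduction of the paper's algorithm: it stands in simple Gaussian input perturbation for the DP step, and the noise scale sigma, the subspace dimension k, and the breast-cancer dataset are all assumptions chosen for the example.

```python
# Illustrative sketch of the SVD-project-then-train-SVM pipeline from the
# abstract. NOTE: the Gaussian input perturbation here is a stand-in; the
# paper's DPSVD calibrates and injects noise in its own way, which the
# abstract does not specify.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

X, y = load_breast_cancer(return_X_y=True)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # center and scale features

# Hypothetical noise scale standing in for a properly calibrated
# (epsilon, delta)-DP Gaussian mechanism.
sigma = 0.5
X_noisy = X + rng.normal(scale=sigma, size=X.shape)

# SVD of the (perturbed) data matrix directly: no covariance matrix is
# formed, which is the advantage of SVD over EVD noted in the abstract.
_, _, Vt = np.linalg.svd(X_noisy, full_matrices=False)
k = 5                  # number of singular vectors kept (assumed)
V_k = Vt[:k].T         # private right singular vectors

# Project instances into the private singular subspace and train the SVM
# on the projected instances, as the abstract describes.
Z = X @ V_k
Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)
clf = SVC().fit(Z_tr, y_tr)
print("accuracy on projected data:", clf.score(Z_te, y_te))
```

In this setup only the noisy singular vectors and the projected instances are used downstream; how DPSVD guarantees that this release satisfies differential privacy is the subject of the paper itself.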