Phan NhatHai, Wu Xintao, Dou Dejing
New Jersey Institute of Technology, Newark, NJ, USA.
University of Arkansas, Fayetteville, AR, USA.
Mach Learn. 2017 Oct;106(9-10):1681-1704. doi: 10.1007/s10994-017-5656-2. Epub 2017 Jul 13.
The remarkable development of deep learning in the medicine and healthcare domains raises obvious privacy issues when deep neural networks are built on users' personal and highly sensitive data, e.g., clinical records, user profiles, and biomedical images. However, only a few scientific studies on preserving privacy in deep learning have been conducted. In this paper, we focus on developing a private convolutional deep belief network (pCDBN), which is essentially a convolutional deep belief network (CDBN) under differential privacy. Our main idea for enforcing ε-differential privacy is to leverage the functional mechanism to perturb the energy-based objective functions of traditional CDBNs, rather than their results. One key contribution of this work is that we propose the use of Chebyshev expansion to derive an approximate polynomial representation of the objective functions. Our theoretical analysis shows that we can further derive the sensitivity and error bounds of this approximate polynomial representation. As a result, preserving differential privacy in CDBNs is feasible. We applied our model to a health social network, i.e., the YesiWell data, and to a handwritten digit dataset, i.e., the MNIST data, for human behavior prediction, human behavior classification, and handwritten digit recognition tasks. Theoretical analysis and rigorous experimental evaluations show that the pCDBN is highly effective. It significantly outperforms existing solutions.
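The two ingredients named in the abstract, a Chebyshev polynomial approximation and perturbation of the objective rather than of the result, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The softplus function is only an assumed stand-in for a non-polynomial term of an energy-based objective, the function names are hypothetical, and the `sensitivity` value is a placeholder rather than the bound derived in the paper. The sketch approximates the function on [-1, 1] with a low-degree polynomial and then adds Laplace noise of scale sensitivity/ε to the polynomial coefficients, which is the general shape of the functional mechanism.

```python
import numpy as np
from numpy.polynomial import Chebyshev, Polynomial
from numpy.polynomial.polynomial import polyval


def softplus(x):
    """Assumed stand-in for a non-polynomial term of an objective function."""
    return np.log1p(np.exp(x))


def chebyshev_poly_coeffs(func, degree):
    """Interpolate func on [-1, 1] at Chebyshev points and return the
    coefficients in the monomial basis, lowest order first."""
    cheb = Chebyshev.interpolate(func, degree)
    return cheb.convert(kind=Polynomial).coef


def perturb_coeffs(coeffs, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to each coefficient.

    sensitivity is a hypothetical placeholder here; a real deployment needs
    a proven bound on how much one record can change the coefficients.
    """
    scale = sensitivity / epsilon
    return coeffs + rng.laplace(loc=0.0, scale=scale, size=coeffs.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coeffs = chebyshev_poly_coeffs(softplus, degree=4)

    # Check how closely the polynomial tracks the original function.
    xs = np.linspace(-1.0, 1.0, 201)
    approx_error = np.max(np.abs(polyval(xs, coeffs) - softplus(xs)))

    noisy = perturb_coeffs(coeffs, sensitivity=1.0, epsilon=1.0, rng=rng)
    print("max approximation error:", approx_error)
    print("clean coefficients:", coeffs)
    print("noisy coefficients:", noisy)
```

In this sketch a smaller ε gives a larger noise scale, so the privacy budget trades off directly against how faithfully the perturbed polynomial reflects the original objective. The approximation error and the noise are separate sources of distortion, which is why bounds on both matter.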