Çetin Gizem S, Chen Hao, Laine Kim, Lauter Kristin, Rindal Peter, Xia Yuhou
Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA 01609, USA.
Microsoft Research, 14820 NE 36th St, Redmond, WA 98052, USA.
BMC Med Genomics. 2017 Jul 26;10(Suppl 2):45. doi: 10.1186/s12920-017-0276-z.
One of the tasks in the iDASH Secure Genome Analysis Competition in 2016 was to demonstrate the feasibility of privacy-preserving queries on homomorphically encrypted genomic data. More precisely, given a list of up to 100,000 mutations, the task was to encrypt the data using homomorphic encryption in a way that allows it to be stored securely in the cloud, and enables the data owner to query the dataset for the presence of specific mutations, without revealing any information about the dataset or the queries to the cloud.
We devise a novel string matching protocol to enable privacy-preserving queries on homomorphically encrypted data. Our protocol combines state-of-the-art techniques from homomorphic encryption and private set intersection protocols to minimize the computational and communication cost.
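The abstract does not spell out the protocol, so the following is only a plaintext sketch of one building block common in private set intersection work: a masked product of differences that is zero exactly when the query is in the dataset. It is an assumption for illustration, not the authors' construction. The modulus `P`, the helpers `encode` and `masked_membership`, and the mutation string format are all hypothetical, and no encryption is performed here. In a homomorphic setting the same arithmetic would run on ciphertexts, and a product over 100,000 items would need batching or partitioning to keep the multiplicative depth manageable.

```python
import hashlib
import secrets
from typing import Sequence

# Illustrative prime modulus (a Mersenne prime); not a parameter from the paper.
P = 2**61 - 1

def encode(mutation: str) -> int:
    """Map a mutation string to an element of the field Z_P (hypothetical encoding)."""
    digest = hashlib.sha256(mutation.encode("utf-8")).digest()
    return int.from_bytes(digest[:7], "big")  # 56 bits, always below P

def masked_membership(query: int, dataset: Sequence[int]) -> int:
    """Return r * prod(query - d) mod P for a random nonzero mask r.

    Because P is prime, the result is 0 if and only if query equals some d.
    When the query is absent, the mask r hides the value of the product.
    """
    acc = 1
    for d in dataset:
        acc = acc * ((query - d) % P) % P
    r = secrets.randbelow(P - 1) + 1  # uniform in [1, P - 1]
    return acc * r % P
```

With real ciphertexts, the party holding the secret key would decrypt the result and learn only whether it is zero.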
We implemented our protocol using the homomorphic encryption library SEAL v2.1, and applied it to obtain an efficient solution to the iDASH competition task. For example, using 8 threads, our protocol achieves a running time of only 4 s, and a communication cost of 2 MB, when querying for the presence of 5 mutations from an encrypted dataset of 100,000 mutations.
We demonstrate that homomorphic encryption can provide an efficient privacy-preserving mechanism for querying the presence of particular mutations in realistically sized datasets. Beyond its applications to genomics, our protocol can be applied equally well to any kind of data, and is therefore of independent interest to the homomorphic encryption community.