Kim Miran, Song Yongsoo, Cheon Jung Hee
Division of Biomedical Informatics, University of California, San Diego, San Diego, CA, 92093, USA.
Department of Mathematical Sciences, Seoul National University, GwanAkRo 1, Seoul, 08826, Republic of Korea.
BMC Med Genomics. 2017 Jul 26;10(Suppl 2):42. doi: 10.1186/s12920-017-0280-3.
As genome sequencing technology advances rapidly, there is a growing need to keep genomic data secure even when it is stored in the cloud and still used for research. We aim to design a protocol for securely outsourcing the matching problem on encrypted data.
We propose an efficient method to securely search for the position that matches the query data and to extract some information at that position. After decryption, only a small number of comparisons with the query information need to be performed in the plaintext state. We apply this method to find a set of biomarkers in encrypted genomes. The key feature of our method is that it encodes a genomic database as a single element of a polynomial ring.
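The idea of encoding a database as a single ring element and answering a query with one multiplication can be illustrated in plaintext. The sketch below is not the authors' encoding or their hybrid encryption scheme. It assumes, purely for illustration, the ring Z_t[X]/(X^n + 1), a database polynomial whose coefficient at X^d holds the value stored at position d, and a query polynomial X^(-d) = -X^(n-d). Under those assumptions, the constant term of the product is the value at position d. All function names and parameters here are hypothetical, and no encryption is performed.

```python
def negacyclic_mul(a, b, n, t):
    """Multiply two polynomials in Z_t[X]/(X^n + 1), given as coefficient lists."""
    out = [0] * n
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if bj == 0:
                continue
            k = i + j
            if k >= n:
                # X^n = -1 in this ring, so wrapped terms change sign.
                out[k - n] = (out[k - n] - ai * bj) % t
            else:
                out[k] = (out[k] + ai * bj) % t
    return out


def encode_database(records, n, t):
    """Pack {position: value} into one polynomial (assumed encoding, 0 = empty)."""
    db = [0] * n
    for pos, val in records.items():
        if not (0 <= pos < n and 0 < val < t):
            raise ValueError("position or value out of range")
        db[pos] = val
    return db


def encode_query(pos, n, t):
    """Return X^(-pos), which equals -X^(n - pos) for pos > 0."""
    q = [0] * n
    if pos == 0:
        q[0] = 1
    else:
        q[n - pos] = t - 1  # -1 mod t
    return q


def lookup(db, pos, n, t):
    """One ring multiplication; the constant term is the value at pos."""
    return negacyclic_mul(db, encode_query(pos, n, t), n, t)[0]


if __name__ == "__main__":
    n, t = 16, 257  # toy parameters, far smaller than any secure setting
    db = encode_database({0: 5, 3: 42, 7: 99}, n, t)
    print(lookup(db, 3, n, t))
```

In a homomorphic setting the same product would be computed on ciphertexts, and the remaining coefficients of the result would still carry shifted database entries. That is consistent with the need, stated in the abstract, for an extraction step that avoids leaking positions that were not queried.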
Since our method requires only a single homomorphic multiplication of a hybrid scheme for query computation, it has an advantage over previous methods in parameter size, computational complexity, and communication cost. In particular, the extraction procedure not only prevents leakage of database information that has not been queried by the user but also reduces the communication cost by half. We evaluate the performance of our method and verify that computation on large-scale personal data can be securely and practically outsourced to a cloud environment during data analysis. It takes about 3.9 s to search and extract the reference and alternate sequences at the queried position in a database of size 4M.
Our solution for finding a set of biomarkers in DNA sequences shows the progress of cryptographic techniques in their capability to support real-world genome data analysis in a cloud environment.