HySec-Flow: Privacy-Preserving Genomic Computing with SGX-based Big-Data Analytics Framework.

Author information

Widanage Chathura, Liu Weijie, Li Jiayu, Chen Hongbo, Wang XiaoFeng, Tang Haixu, Fox Judy

Affiliations

Indiana University.

University of Virginia.

Publication information

IEEE Int Conf Cloud Comput. 2021 Sep;2021:733-743. doi: 10.1109/CLOUD53861.2021.00098. Epub 2021 Nov 13.

Abstract

Trusted execution environments (TEEs) such as Intel's Software Guard Extensions (SGX) have been widely studied as a way to strengthen security and privacy protection when computing on sensitive data such as human genomic data. However, SGX often imposes a performance penalty, especially because of its small enclave memory. In this paper, we propose a new Hybrid Secured Flow framework (called "HySec-Flow") for large-scale genomic data analysis on SGX platforms. Here, data-intensive computing tasks can be partitioned into independent subtasks that are deployed into distinct secured and non-secured containers, which allows parallel execution while alleviating the limited size of the Enclave Page Cache (EPC) memory in each enclave. We illustrate our contributions using a workflow that supports indexing, alignment, dispatching, and merging in the execution of SGX-enabled containers. We provide details on the architecture of the trusted and untrusted components and on the underlying SCONE and Graphene support, which serve as generic shielded-execution frameworks for porting legacy code. We thoroughly evaluate the performance of our privacy-preserving reads mapping algorithm using real human genome sequencing data. The results demonstrate that partitioning the time-consuming genomic computation into subtasks improves performance compared with conventional execution of the data-intensive reads mapping algorithm in a single enclave. The proposed HySec-Flow framework is available as open source and can be adapted to the data-parallel computation of other large-scale genomic tasks that require security and scalable computational resources.
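The partition, dispatch, and merge pattern described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HySec-Flow implementation: no SGX, SCONE, or Graphene is involved, a thread pool stands in for the containers, exact substring search stands in for reads mapping, and the names `EPC_BUDGET_BYTES`, `partition_reads`, `map_chunk`, and `run_workflow` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Hypothetical per-enclave memory budget in bytes (an assumption for this sketch;
# the usable EPC size on real hardware depends on the platform).
EPC_BUDGET_BYTES = 64


def partition_reads(reads, budget_bytes=EPC_BUDGET_BYTES):
    """Greedily group reads into chunks whose total length fits the budget."""
    chunks, current, used = [], [], 0
    for read in reads:
        size = len(read)
        # Close the current chunk once adding this read would exceed the budget.
        if current and used + size > budget_bytes:
            chunks.append(current)
            current, used = [], 0
        current.append(read)
        used += size
    if current:
        chunks.append(current)
    return chunks


def map_chunk(reference, chunk):
    """Toy stand-in for one subtask: first exact-match offset per read, -1 if absent."""
    return [(read, reference.find(read)) for read in chunk]


def run_workflow(reference, reads, workers=4, budget_bytes=EPC_BUDGET_BYTES):
    """Partition the reads, map the chunks in parallel, and merge in input order."""
    chunks = partition_reads(reads, budget_bytes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the submitted chunks.
        partials = list(pool.map(partial(map_chunk, reference), chunks))
    return [hit for part in partials for hit in part]
```

Calling `run_workflow("ACGTACGTTAGC", ["ACGT", "TAGC", "GGGG"])` returns one `(read, offset)` pair per read in input order. In a real deployment each chunk would go to a separate container, and the budget would be chosen so that a subtask's working set fits in enclave memory.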
