

Privacy-preserving cancer type prediction with homomorphic encryption.

Author information

Tandon School of Engineering, New York University, Brooklyn, NY, 11201, USA.

Center for Cyber Security, New York University Abu Dhabi, Abu Dhabi, 129188, UAE.

Publication information

Sci Rep. 2023 Jan 30;13(1):1661. doi: 10.1038/s41598-023-28481-8.

Abstract

Cancer genomics tailors diagnosis and treatment to an individual's genetic information and is the crux of precision medicine. However, analyzing and maintaining a high volume of genetic mutation data to build a machine learning (ML) model that predicts cancer type is computationally expensive and is often outsourced to powerful cloud servers, raising critical privacy concerns for patients' data. Homomorphic encryption (HE) enables computation on encrypted data, thus providing cryptographic guarantees to protect privacy. However, the restrictive overhead of encrypted computation deters its use. In this work, we explore the challenges of privacy-preserving cancer type prediction using a dataset of more than 2 million genetic mutations from 2713 patients across several cancer types: we build a highly accurate ML model and then implement its privacy-preserving version in HE. Our solution for cancer type inference encodes somatic mutations into the feature space based on their impact on the cancer genomes and then uses statistical tests for feature selection. We propose a fast matrix multiplication algorithm for HE-based models. Our final model achieves a micro-average area under the curve of 0.98 and improves accuracy from 70.08% to 83.61%, while being 550 times faster than standard matrix multiplication-based privacy-preserving models. Our tool can be found at https://github.com/momalab/octal-candet.
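As an illustration of the feature-selection step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. The abstract does not say which statistical tests are used, so this sketch assumes a Pearson chi-squared test on binary features. The function names `chi2_score` and `select_top_k`, the toy matrix `X`, and the labels are all hypothetical.

```python
from collections import Counter


def chi2_score(feature, labels):
    """Pearson chi-squared statistic for one categorical feature vs. class labels.

    Assumption: the chi-squared test stands in for the unspecified
    "statistical tests" named in the abstract.
    """
    n = len(labels)
    class_counts = Counter(labels)           # patients per cancer type
    value_counts = Counter(feature)          # patients per feature value
    observed = Counter(zip(feature, labels)) # joint counts
    score = 0.0
    for value, v_count in value_counts.items():
        for cls, c_count in class_counts.items():
            # Expected joint count if feature and label were independent.
            expected = v_count * c_count / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score


def select_top_k(X, labels, k):
    """Return the (sorted) column indices of the k highest-scoring features."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data (hypothetical): rows are patients, columns are binary mutation features.
X = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
labels = ["A", "A", "B", "B"]

selected = select_top_k(X, labels, k=1)
print(selected)
```

In this toy example only the first column separates the two classes, so it is the one retained. Reducing the feature count in this way also shrinks the matrices that the encrypted model has to multiply, which is where the cost of HE inference is concentrated.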


