Colombo Aurora A F, Colombo Luca, Falcetta Alessandro, Roveri Manuel
Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy.
Pac Symp Biocomput. 2025;30:565-579. doi: 10.1142/9789819807024_0040.
Precision medicine significantly improves patient prognosis by offering personalized treatments. For metastatic cancer in particular, incorporating the primary tumor location into the diagnostic process greatly improves survival rates. However, traditional methods for identifying it rely on human expertise and require substantial time and financial resources. Machine Learning (ML) and Deep Learning (DL) have proven particularly effective in addressing this challenge. Yet their application to medical data, and to genomic data especially, must account for privacy because the data are highly sensitive. In this paper, we propose OGHE, a convolutional neural network-based approach to privacy-preserving cancer classification that exploits spatial patterns in genomic data while maintaining confidentiality by means of Homomorphic Encryption (HE). HE allows computation to be carried out directly on encrypted data, guaranteeing confidentiality throughout the entire computation. OGHE is designed specifically for privacy-preserving applications: it takes the limitations of HE into account from the outset and introduces an efficient packing mechanism to minimize the computational overhead that HE introduces. In addition, OGHE relies on a novel feature selection method, VarScout, which extracts the most significant features through clustering and occurrence analysis while preserving the inherent spatial patterns. Coupled with VarScout, OGHE has been compared with existing privacy-preserving solutions for encrypted cancer classification on the iDash 2020 dataset. The results demonstrate that the combination provides accurate privacy-preserving cancer classification and that the packing mechanism reduces latency. The code is released to the scientific community.
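To make the idea of computing directly on encrypted data concrete, the following is a minimal Python sketch under stated assumptions. It uses textbook Paillier encryption, an additively homomorphic scheme chosen here only because it fits in a few lines; the abstract does not say which HE scheme OGHE uses, and this is not the authors' implementation. All names (`encrypt`, `encrypted_dot`, `pack`, `SLOT_BITS`) are hypothetical, the key is far too small to be secure, and the slot packing shown is a generic illustration of why packing reduces the number of ciphertext operations, not OGHE's packing mechanism.

```python
import math
import random

# Toy key material (assumption for illustration): two Mersenne primes.
# A real deployment would use a vetted HE library and much larger keys.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # valid because the generator is g = N + 1

def encrypt(m):
    """Encrypt a non-negative integer m < N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    """Recover the plaintext integer from ciphertext c."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def add(c1, c2):
    """Ciphertext of the sum of the two underlying plaintexts."""
    return (c1 * c2) % N2

def scale(c, k):
    """Ciphertext of k times the underlying plaintext (k a plain integer >= 0)."""
    return pow(c, k, N2)

def encrypted_dot(enc_features, weights):
    """Weighted sum of encrypted features with plaintext weights.

    The party running this never sees the feature values.
    """
    acc = encrypt(0)
    for c, w in zip(enc_features, weights):
        acc = add(acc, scale(c, w))
    return acc

SLOT_BITS = 16  # hypothetical slot width; slot sums must stay below 2**16

def pack(values):
    """Place several small integers side by side in one plaintext."""
    return sum(v << (SLOT_BITS * i) for i, v in enumerate(values))

def unpack(m, count):
    """Split a packed plaintext back into `count` slot values."""
    mask = (1 << SLOT_BITS) - 1
    return [(m >> (SLOT_BITS * i)) & mask for i in range(count)]

if __name__ == "__main__":
    # The data owner encrypts; the server computes; the data owner decrypts.
    enc = [encrypt(x) for x in [3, 5, 2]]
    assert decrypt(encrypted_dot(enc, [4, 1, 7])) == 3 * 4 + 5 * 1 + 2 * 7

    # One homomorphic addition updates four packed slots at once.
    packed_sum = add(encrypt(pack([1, 2, 3, 4])), encrypt(pack([10, 20, 30, 40])))
    assert unpack(decrypt(packed_sum), 4) == [11, 22, 33, 44]
```

In the sketch, the weighted sum stands in for a single linear operation of a network evaluated on encrypted inputs. The packed addition shows the general motivation for packing: several values share one ciphertext, so one homomorphic operation does the work of many.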