Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.
Sci Rep. 2023 May 8;13(1):7461. doi: 10.1038/s41598-023-34535-8.
Classification of viral strains is essential in monitoring and managing the COVID-19 pandemic, but patient privacy and data security concerns often limit the open sharing of full viral genome sequencing data. We propose a framework called CoVnita that supports private training of a classification model and secure inference with the same model. Using genomic sequences from eight common SARS-CoV-2 strains, we simulated scenarios in which the data were distributed across multiple data providers. Our framework produces a private federated model across 8 parties with a classification AUROC of 0.99, given a privacy budget of [Formula: see text]. The round trip from encryption to decryption took 0.298 s in total, with an amortized time of 74.5 ms per sample.
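The two timing figures can be related with a short calculation. The sketch below assumes that the amortized time is simply the total round-trip time divided by the number of samples processed together; the abstract does not state a batch size, so the value computed here is an inference, and the variable names are illustrative.

```python
# Timing figures as reported in the abstract.
total_roundtrip_s = 0.298        # encryption to decryption, in seconds
amortized_per_sample_s = 0.0745  # 74.5 ms per sample

# Assumption (not stated in the abstract): amortized time = total time / samples.
implied_batch_size = round(total_roundtrip_s / amortized_per_sample_s)

print(f"Implied samples per round trip: {implied_batch_size}")
```

Under that assumption, the reported numbers are consistent with four samples being handled in a single encrypted round trip.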