Li Wenqi, Milletarì Fausto, Xu Daguang, Rieke Nicola, Hancox Jonny, Zhu Wentao, Baust Maximilian, Cheng Yan, Ourselin Sébastien, Cardoso M Jorge, Feng Andrew
NVIDIA.
Biomedical Engineering and Imaging Sciences, King's College London, UK.
Mach Learn Med Imaging. 2019;11861:133-141. doi: 10.1007/978-3-030-32692-0_16. Epub 2019 Oct 10.
Due to medical data privacy regulations, it is often infeasible to collect and share patient data in a centralised data lake. This poses challenges for training machine learning algorithms, such as deep convolutional networks, which often require large numbers of diverse training examples. Federated learning sidesteps this difficulty by bringing code to the patient data owners and sharing only intermediate model training updates among them. Although a high-accuracy model could be achieved by appropriately aggregating these model updates, the shared model could indirectly leak the local training examples. In this paper, we investigate the feasibility of applying differential-privacy techniques to protect the patient data in a federated learning setup. We implement and evaluate practical federated learning systems for brain tumour segmentation on the BraTS dataset. The experimental results show a trade-off between model performance and the cost of privacy protection.
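To make the setup concrete, the following is a minimal sketch of the general idea described in the abstract: each client clips its model update, shares only a fraction of its components, and perturbs them with Gaussian noise before the server averages the contributions. The function names, the top-k selection rule, and all constants (clip_norm, noise_std, share_fraction) are illustrative assumptions, not the paper's actual implementation or parameter settings.

```python
import numpy as np

def clip_and_perturb(update, clip_norm=1.0, noise_std=0.01, share_fraction=0.1):
    """Client side: clip the model update, keep only the largest-magnitude
    fraction of its components, and add Gaussian noise before sharing.
    All defaults here are illustrative, not the paper's settings."""
    flat = update.ravel()
    # Clip the L2 norm to bound each client's contribution (sensitivity control).
    norm = np.linalg.norm(flat)
    if norm > clip_norm:
        flat = flat * (clip_norm / norm)
    # Selective sharing: transmit only the k largest-magnitude entries.
    k = max(1, int(share_fraction * flat.size))
    idx = np.argsort(np.abs(flat))[-k:]
    shared = np.zeros_like(flat)
    # Gaussian mechanism: perturb the shared components with calibrated noise.
    shared[idx] = flat[idx] + np.random.normal(0.0, noise_std, size=k)
    return shared.reshape(update.shape)

def federated_average(updates):
    """Server side: aggregate the perturbed client updates by averaging."""
    return np.mean(np.stack(updates), axis=0)

# Toy round with three clients and a 10-parameter "model".
global_weights = np.zeros(10)
client_updates = [np.random.randn(10) * 0.1 for _ in range(3)]
shared = [clip_and_perturb(u) for u in client_updates]
global_weights += federated_average(shared)
```

Only the noisy, partially shared updates ever leave a client, which is the mechanism behind the performance-versus-privacy trade-off the experiments measure: stronger noise and sparser sharing improve privacy protection but degrade the aggregated segmentation model.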