Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands.
Athena Institute, VU University Amsterdam, Amsterdam, The Netherlands.
Euro Surveill. 2024 Sep;29(38). doi: 10.2807/1560-7917.ES.2024.29.38.2300695.
Background
The wide application of machine learning (ML) holds great potential to improve public health by supporting data analysis that informs policy and practice. Its application, however, is often hampered by data fragmentation across organisations and by strict regulation under the General Data Protection Regulation (GDPR). Federated learning (FL), a decentralised approach to ML, has received considerable interest as a means to overcome the fragmentation of data, but it is not yet clear to what extent this approach complies with the GDPR.
Aim
Our aim was to understand the potential data protection implications of using federated learning for public health purposes.
Methods
Building upon semi-structured interviews (n = 14) and a panel discussion (n = 5) with key opinion leaders in Europe, including both FL and GDPR experts, we explored how GDPR principles would apply to the implementation of FL within public health.
Results
This study found that FL offers substantial benefits, such as data minimisation, storage limitation and effective mitigation of many of the privacy risks of sharing personal data, but it also identified various challenges. These challenges mostly relate to the increased difficulty of checking data at the source and the limited understanding of potential adverse outcomes of the technology.
Conclusion
Since FL is still in its early phase and under rapid development, knowledge of its impracticalities is expected to increase rapidly, potentially addressing the remaining challenges. In the meantime, this study reflects on the potential of FL to align with data protection objectives and offers guidance on GDPR compliance.