Julius Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
Julius Clinical, Zeist, The Netherlands.
BMC Infect Dis. 2024 Nov 21;24(1):1327. doi: 10.1186/s12879-024-10230-5.
The use of real-world data has become increasingly popular, including in the field of infectious disease (ID), particularly since the emergence of the COVID-19 pandemic. While much data useful for research is being collected, these data are generally stored across different sources. Privacy concerns limit the possibility of storing the data centrally, which in turn limits the ability to leverage the full potential of combined data. Federated learning (FL) has been suggested as a way to overcome privacy issues by making it possible to perform research on data from various sources without those data leaving local servers. In this review, we discuss existing applications of FL in ID research, as well as the most relevant opportunities and challenges of this method.
References for this review were identified through searches of MEDLINE/PubMed, Google Scholar, Embase and Scopus until July 2023. We searched for studies using FL in different applications related to ID.
Thirty references were included and divided into four sub-topics: disease screening, prediction of clinical outcomes, infection epidemiology, and vaccine research. Most research was related to COVID-19. In all studies, FL achieved good accuracy when predicting diseases and outcomes, including in comparison with non-federated methods. However, most studies did not use real-world federated data; instead, they demonstrated the potential of FL using data that were manually partitioned.
FL is a promising methodology that allows data from several sources to be used, potentially generating stronger and more generalisable results. However, further exploration of potential applications of FL in ID research is needed.