Choi Geunho, Cha Won Chul, Lee Se Uk, Shin Soo-Yong
Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, Korea.
Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea.
Healthc Inform Res. 2024 Jan;30(1):3-15. doi: 10.4258/hir.2024.30.1.3. Epub 2024 Jan 31.
Medical artificial intelligence (AI) has recently attracted considerable attention. However, training medical AI models is challenging due to privacy-protection regulations. Among the proposed solutions, federated learning (FL) stands out. FL involves transmitting only model parameters without sharing the original data, making it particularly suitable for the medical field, where data privacy is paramount. This study reviews the application of FL in the medical domain.
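To make the parameter-only exchange concrete, the following is a minimal sketch of one common FL aggregation scheme, weighted federated averaging. It is an illustration under stated assumptions, not the method of any reviewed study: the three synthetic "sites," the one-feature linear model, and the function names (`local_update`, `federated_average`, `run_federated_training`) are all hypothetical.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Hypothetical client step: gradient descent on y = w*x + b.

    Only the site's own (x, y) pairs are read here; the function returns
    the updated parameters and the local sample count, never the data.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n


def federated_average(updates):
    """Server step: average parameters, weighted by each site's sample count."""
    total = sum(n for _, n in updates)
    w = sum(params[0] * n for params, n in updates) / total
    b = sum(params[1] * n for params, n in updates) / total
    return (w, b)


def make_site_data(rng, n, x_low, x_high):
    """Synthetic stand-in for one institution's private data (assumed y = 2x + 1)."""
    xs = [rng.uniform(x_low, x_high) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.02)) for x in xs]


def run_federated_training(rounds=200, seed=0):
    rng = random.Random(seed)
    # Each site covers a different input range, a simple form of heterogeneity.
    sites = [
        make_site_data(rng, 30, 0.0, 1.0),
        make_site_data(rng, 50, 0.5, 1.5),
        make_site_data(rng, 20, -1.0, 0.0),
    ]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Sites train locally; only (parameters, sample count) reach the server.
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates)
    return global_weights


if __name__ == "__main__":
    w, b = run_federated_training()
    print(f"global model: w={w:.3f}, b={b:.3f}")
```

In this sketch the server only ever sees parameter pairs and sample counts. Real deployments typically add secure aggregation or differential privacy on top, since model parameters alone can still leak information about the training data.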
We searched Google Scholar and PubMed for the keyword "federated learning" combined with "medical," "healthcare," or "clinical." After reviewing titles and abstracts, 58 papers were selected for analysis. These FL studies were categorized based on the types of data used, the target disease, the use of open datasets, the local model of FL, and the neural network model. We also examined issues related to heterogeneity and security.
In the investigated FL studies, the most commonly used data type was image data, and the most studied target diseases were cancer and COVID-19. The majority of studies utilized open datasets. Furthermore, 72% of the FL articles addressed heterogeneity issues, while 50% discussed security concerns.
FL in the medical domain appears to be in its early stages, with most research using open data and focusing on specific data types and diseases for performance verification purposes. Nonetheless, medical FL research is anticipated to be increasingly applied and to become a vital component of multi-institutional research.