Crowson Matthew G, Moukheiber Dana, Arévalo Aldo Robles, Lam Barbara D, Mantena Sreekar, Rana Aakanksha, Goss Deborah, Bates David W, Celi Leo Anthony
Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America.
Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Massachusetts, United States of America.
PLOS Digit Health. 2022 May 19;1(5):e0000033. doi: 10.1371/journal.pdig.0000033. eCollection 2022 May.
Federated learning (FL) allows multiple institutions to collaboratively develop a machine learning algorithm without sharing their data. Organizations instead share model parameters only, allowing them to benefit from a model built with a larger dataset while maintaining the privacy of their own data. We conducted a systematic review to evaluate the current state of FL in healthcare and discuss the limitations and promise of this technology.
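As an illustration of the parameter-sharing idea described above, the following is a minimal sketch of federated averaging with a central aggregation server. It is not taken from any of the reviewed studies. The model (a one-feature linear regression), the function names `local_update`, `federated_average`, and `train`, and the learning rate, epoch, and round counts are all assumptions chosen for brevity. Each site trains on its own records and returns only its parameters; the raw data never leave the site.

```python
from typing import List, Tuple

Weights = Tuple[float, float]         # (slope, intercept) of a toy linear model
Dataset = List[Tuple[float, float]]   # (x, y) pairs held privately by one site


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one site's private data.

    Only the updated parameters are returned; the data stay local.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for y ~ w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Server step: average site parameters, weighted by local sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train(sites: List[Dataset], rounds: int = 200) -> Weights:
    """Alternate local training and server aggregation for a fixed number of rounds."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each site starts from the current global model (hypothetical sites).
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


if __name__ == "__main__":
    # Two hypothetical institutions whose data follow y = 2x + 1.
    site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    site_b = [(3.0, 7.0), (4.0, 9.0)]
    print(train([site_a, site_b]))
```

In this toy setting both sites draw from the same underlying relationship, so the averaged model approaches the shared optimum. With heterogeneous site data, plain averaging can behave differently, which relates to the data homogeneity concern raised in the conclusions.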
We conducted a literature search using PRISMA guidelines. At least two reviewers assessed each study for eligibility and extracted a predetermined set of data. The quality of each study was determined using the TRIPOD guideline and PROBAST tool.
Thirteen studies were included in the full systematic review. Most were in the field of oncology (6 of 13; 46.2%), followed by radiology (5 of 13; 38.5%). The majority evaluated imaging results, performed a binary classification prediction task via offline learning (n = 12; 92.3%), and used a centralized topology with an aggregation server workflow (n = 10; 76.9%). Most studies were compliant with the major reporting requirements of the TRIPOD guidelines. In all, 6 of 13 studies (46.2%) were judged to be at high risk of bias using the PROBAST tool, and only 5 studies used publicly available data.
Federated learning is a growing field in machine learning with many promising uses in healthcare. Few studies have been published to date. Our evaluation found that investigators can do more to address the risk of bias and increase transparency by adding steps for data homogeneity or sharing required metadata and code.