Towards More Efficient Data Valuation in Healthcare Federated Learning using Ensembling

Authors

Kumar Sourav, Lakshminarayanan A, Chang Ken, Guretno Feri, Mien Ivan Ho, Kalpathy-Cramer Jayashree, Krishnaswamy Pavitra, Singh Praveer

Affiliations

Department of Radiology, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Boston, MA, USA.

Institute for Infocomm Research, A*STAR, Singapore.

Publication information

Distrib Collab Fed Learn Afford AI Healthc Resour Div Glob Health (2022). 2022 Sep;13573:119-129. doi: 10.1007/978-3-031-18523-6_12. Epub 2022 Oct 7.

DOI:10.1007/978-3-031-18523-6_12

PMID:36745141

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9890952/

Abstract

Federated Learning (FL), wherein multiple institutions collaboratively train a machine learning model without sharing data, is becoming popular. Participating institutions might not contribute equally: some contribute more data, some better-quality data, and some more diverse data. To fairly rank the contributions of different institutions, the Shapley value (SV) has emerged as the method of choice. Exact SV computation is impossibly expensive, especially when there are hundreds of contributors, so existing SV computation techniques use approximations. However, in healthcare, where the number of contributing institutions is unlikely to be of a colossal scale, computing exact SVs is still exorbitantly expensive, but not impossible. For such settings, we propose an efficient SV computation technique called SaFE (Shapley Value for Federated Learning using Ensembling). We empirically show that SaFE computes values that are close to exact SVs, and that it performs better than current SV approximations. This is particularly relevant in medical imaging settings, where heterogeneity across institutions is widespread and fast, accurate data valuation is required to determine the contribution of each participant in multi-institutional collaborative learning.
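
To make the cost argument concrete, the following is a minimal sketch of exact Shapley value computation for a small number of institutions. It is not the SaFE method, whose mechanics the abstract does not describe; it only illustrates the exact baseline that SaFE is said to approximate. The function names (`exact_shapley`, `toy_utility`), the institution labels, and the sample counts are all hypothetical. In a real FL setting the utility of a coalition would be the validation performance of a model trained by that coalition, which is why evaluating all 2^n coalitions becomes so expensive.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Exact Shapley values by enumerating all 2**n coalitions.

    `utility` maps a frozenset of players to a number. In FL this would
    mean training and evaluating one model per coalition (assumption).
    """
    n = len(players)

    # Evaluate every coalition exactly once and cache the result.
    cache = {}
    for size in range(n + 1):
        for coalition in combinations(players, size):
            s = frozenset(coalition)
            cache[s] = utility(s)

    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (cache[s | {p}] - cache[s])
        values[p] = total
    return values, len(cache)


# Hypothetical sample counts for three institutions (illustration only).
SAMPLES = {"A": 100, "B": 300, "C": 600}


def toy_utility(coalition):
    """Stand-in for model performance: diminishing returns in pooled data."""
    return sum(SAMPLES[i] for i in coalition) ** 0.5


values, n_calls = exact_shapley(list(SAMPLES), toy_utility)
for name, value in values.items():
    print(f"{name}: {value:.3f}")
print(f"utility evaluations: {n_calls}")
```

With three institutions the loop needs 8 utility evaluations; with 10 it needs 1,024 and with 20 over a million, each of which would be a full model training run. This is consistent with the abstract's point that exact SVs are feasible but very costly at the scale of typical healthcare collaborations.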

Cited by

Machine Learning-Driven Data Valuation for Optimizing High-Throughput Screening Pipelines.

J Chem Inf Model. 2024 Nov 11;64(21):8142-8152. doi: 10.1021/acs.jcim.4c01547. Epub 2024 Oct 23.

Recent methodological advances in federated learning for healthcare.

Patterns (N Y). 2024 Jun 14;5(6):101006. doi: 10.1016/j.patter.2024.101006.
