Towards practical federated learning and evaluation for medical prediction models.

Authors

Kazlouski Andrei, Montoya Perez Ileana, Noor Faiza, Högerman Mikael, Ettala Otto, Pahikkala Tapio, Airola Antti

Author affiliations

Department of Computing, University of Turku, Turku, Finland.

Publication information

Int J Med Inform. 2025 Dec;204:106046. doi: 10.1016/j.ijmedinf.2025.106046. Epub 2025 Jul 18.

Abstract

BACKGROUND

Federated learning (FL) is a rapidly advancing technique that enables collaborative model training while preserving data privacy. This approach is particularly relevant in healthcare, where privacy concerns and regulatory restrictions often prevent centralized data sharing. FL has shown promise in tasks such as disease detection, achieving performance levels comparable to centralized systems. However, its practical usability in real-world applications remains underexplored.

METHODS

We evaluate the practical effectiveness of FL in predicting whether patients suspected of prostate cancer require invasive biopsy procedures. The study uses 14 publicly available prostate cancer datasets from 10 countries. We propose and benchmark a novel FL evaluation strategy, Leave-Silo-Out (LSO), which quantifies the performance gap between federated training and free-riding (utilizing the federated model without contributing data). Additionally, we investigate whether locally trained models can outperform multi-hospital FL models. The results are assessed with a focus on improving the diagnosis of local patients.
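
The Leave-Silo-Out comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the model, the federated algorithm, the data split, or the metric, so the logistic regression, the FedAvg-style averaging, the 50/50 local split, the accuracy metric, and the synthetic data are all assumptions, and every function name (`train_logreg`, `fedavg`, `leave_silo_out`, `make_silos`) is hypothetical.

```python
import numpy as np

def train_logreg(X, y, w=None, epochs=50, lr=0.1):
    """Gradient-descent logistic regression (assumed model, not from the paper)."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def fedavg(silos, rounds=20, local_epochs=5):
    """FedAvg-style stand-in: average locally updated weights by silo size."""
    w = np.zeros(silos[0][0].shape[1])
    sizes = np.array([len(y) for _, y in silos], dtype=float)
    for _ in range(rounds):
        local = [train_logreg(X, y, w, epochs=local_epochs) for X, y in silos]
        w = np.average(local, axis=0, weights=sizes)
    return w

def accuracy(w, X, y):
    """Fraction of correct predictions (assumed metric)."""
    return float(np.mean(((X @ w) > 0) == (y == 1)))

def leave_silo_out(silos, test_fraction=0.5, seed=0):
    """For each silo, compare free-riding, federated participation, and local training."""
    rng = np.random.default_rng(seed)
    results = []
    for k, (X, y) in enumerate(silos):
        idx = rng.permutation(len(y))
        n_test = max(1, int(test_fraction * len(y)))
        test, train = idx[:n_test], idx[n_test:]
        others = [s for j, s in enumerate(silos) if j != k]
        w_free = fedavg(others)                           # silo k contributes no data
        w_fed = fedavg(others + [(X[train], y[train])])   # silo k joins the federation
        w_local = train_logreg(X[train], y[train], epochs=100)
        free = accuracy(w_free, X[test], y[test])
        fed = accuracy(w_fed, X[test], y[test])
        results.append({
            "silo": k,
            "free_riding": free,
            "federated": fed,
            "local": accuracy(w_local, X[test], y[test]),
            "gap": fed - free,  # benefit of contributing data over free-riding
        })
    return results

def make_silos(sizes, dim=5, seed=0):
    """Synthetic silos sharing one ground-truth model (illustrative data only)."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)
    silos = []
    for n in sizes:
        X = rng.normal(size=(n, dim))
        y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(float)
        silos.append((X, y))
    return silos

if __name__ == "__main__":
    for row in leave_silo_out(make_silos([40, 200, 1000])):
        print(row)
```

Each result row holds, for one held-out silo, the test performance of the free-riding model, the federated model trained with that silo's data, and a purely local model. The `gap` entry corresponds to the federated versus free-riding difference that LSO is said to quantify, and comparing `local` with `federated` mirrors the question of whether local training can outperform the multi-hospital model.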

RESULTS

Our findings reveal that the benefits of FL vary with the amount of locally available annotated data. Hospitals with very small datasets see negligible improvements from FL compared to free-riding. Institutions with moderate datasets may achieve some gains through FL training. However, hospitals with extensive datasets often experience little to no advantage from FL and, in some cases, observe reduced performance compared to local training.

CONCLUSION

Federated learning shows potential in scenarios with limited data availability. However, its practical applicability is highly context-dependent, influenced by factors such as data availability and specific task requirements.
