A robust sampling technique for realistic distribution simulation in federated learning.

Authors

Hoepp Robin, Rist Leonhard, Katzmann Alexander, Ashok Raghavan, Wimmer Andreas, Sühling Michael, Maier Andreas

Affiliations

Computed Tomography, Siemens Healthineers, Forchheim, Germany.

Pattern Recognition Lab, FAU Erlangen-Nürnberg, Erlangen, Germany.

Publication information

Int J Comput Assist Radiol Surg. 2025 Sep 2. doi: 10.1007/s11548-025-03504-z.

DOI:10.1007/s11548-025-03504-z
PMID:40892192
Abstract

PURPOSE

Federated Learning helps train deep learning networks with diverse data from different locations, particularly in restricted clinical settings. However, label distributions that overlap only partially across clients, due to different demographics, may significantly harm the global training and thus local model performance. Investigating such effects before rolling out large-scale Federated Learning setups requires proper sampling of the expected label distributions.

METHODS

We present a sampling algorithm that builds data subsets with desired means and standard deviations from an initial global distribution. To this end, we incorporate the chi-squared and Gini impurity measures to numerically optimize label distributions for multiple groups efficiently.
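
The abstract does not spell out the optimization itself, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a normal-shaped target defined by the desired mean and standard deviation, bins the global labels, draws per-bin quotas without replacement, and then reports a Pearson chi-squared statistic and the Gini impurity of the resulting histogram. All function names and parameters (`sample_subset`, `n_bins`, and so on) are hypothetical.

```python
import numpy as np

def target_bin_counts(edges, mean, std, n_samples):
    """Expected per-bin counts for a normal-shaped target (assumption of this sketch)."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    density = np.exp(-0.5 * ((centers - mean) / std) ** 2)
    return density / density.sum() * n_samples

def chi_squared(observed, expected, eps=1e-9):
    """Pearson chi-squared statistic between observed and expected bin counts."""
    return float(np.sum((observed - expected) ** 2 / (expected + eps)))

def gini_impurity(counts):
    """Gini impurity 1 - sum(p_i^2) of a histogram of counts."""
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts / total
    return float(1.0 - np.sum(p ** 2))

def sample_subset(labels, mean, std, n_samples, n_bins=20, seed=0):
    """Draw indices whose label histogram approximates the target mean and std."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(labels.min(), labels.max(), n_bins + 1)
    # Map each label to a bin index in [0, n_bins - 1].
    bin_ids = np.clip(np.digitize(labels, edges) - 1, 0, n_bins - 1)
    expected = target_bin_counts(edges, mean, std, n_samples)
    chosen = []
    for b in range(n_bins):
        pool = np.flatnonzero(bin_ids == b)
        # Never request more samples than the global pool holds in this bin.
        k = min(int(round(expected[b])), pool.size)
        if k > 0:
            chosen.append(rng.choice(pool, size=k, replace=False))
    idx = np.concatenate(chosen) if chosen else np.array([], dtype=int)
    observed = np.bincount(bin_ids[idx], minlength=n_bins).astype(float)
    return idx, chi_squared(observed, expected), gini_impurity(observed)
```

In this sketch the chi-squared value indicates how far the drawn subset is from the requested distribution (for example when a bin of the global pool is exhausted), and the Gini impurity summarizes how spread out the subset is across bins. A multi-group version would have to share the pool between groups, which is omitted here.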

RESULTS

Using a real-world application scenario, we sample train and test groups according to region-specific distributions for 3D camera-based weight and height estimation in a clinical context, comparing a hard data split serving as a baseline with our proposed sampling technique. We train a baseline model on all data for comparison and use Federated Averaging to combine the training of our data subsets, demonstrating a realistic deterioration of 25.3% on weight and 28.7% on height estimations by the global model.
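
For readers unfamiliar with Federated Averaging, the aggregation step can be sketched as a sample-size-weighted mean of the client parameters. This is a generic illustration, not the training code used in the study. The helper `relative_deterioration` assumes the reported percentages are relative increases in an error metric over the baseline model, which the abstract does not state explicitly, and both function names are hypothetical.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregate per-client parameter lists, weighting each client by its sample count."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    n_layers = len(client_weights[0])
    return [
        sum(c * w[layer] for c, w in zip(coeffs, client_weights))
        for layer in range(n_layers)
    ]

def relative_deterioration(baseline_error, federated_error):
    """Percentage increase in error relative to the baseline (assumed definition)."""
    return 100.0 * (federated_error - baseline_error) / baseline_error

# Illustrative values only; they are not taken from the paper.
client_a = [np.array([1.0, 1.0])]
client_b = [np.array([3.0, 3.0])]
global_weights = federated_average([client_a, client_b], client_sizes=[100, 300])
```

Because the weights are proportional to client sample counts, a client whose sampled subset is both large and strongly biased dominates the global model, which is the kind of effect the sampling technique is meant to expose before deployment.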

CONCLUSIONS

Realistic client-biased label distributions can notably harm training in a federated context. Our sampling algorithm for simulating realistic data distributions opens up an efficient way to analyze this effect in advance. The technique is agnostic to the chosen network architecture and target scenario and can be adapted to any feature or label problem with non-IID subpopulations.
