Hoepp Robin, Rist Leonhard, Katzmann Alexander, Ashok Raghavan, Wimmer Andreas, Sühling Michael, Maier Andreas
Computed Tomography, Siemens Healthineers, Forchheim, Germany.
Pattern Recognition Lab, FAU Erlangen-Nürnberg, Erlangen, Germany.
Int J Comput Assist Radiol Surg. 2025 Sep 2. doi: 10.1007/s11548-025-03504-z.
Federated Learning helps train deep learning networks on diverse data from different locations, particularly in restricted clinical settings. However, label distributions that overlap only partially across clients, for example because of differing demographics, may significantly harm global training and thus local model performance. Investigating such effects before rolling out large-scale Federated Learning setups requires proper sampling of the expected label distributions.
We present a sampling algorithm that builds data subsets with desired means and standard deviations from an initial global distribution. To this end, we incorporate the chi-squared and Gini impurity measures to numerically optimize the label distributions of multiple groups efficiently.
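The idea of drawing a subset with a desired mean and standard deviation from a global label pool can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a Gaussian target shape, fixed-width histogram bins, and simple weighted sampling without replacement, and it only reports the chi-squared statistic and the Gini impurity of the result instead of optimizing them. All function names and parameters (`sample_subset`, `n_bins`, `seed`) are hypothetical.

```python
import numpy as np


def gini_impurity(counts):
    """Gini impurity 1 - sum(p_i^2) of a histogram of counts."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts / total
    return float(1.0 - np.sum(p ** 2))


def chi_squared(observed, expected):
    """Pearson chi-squared statistic; bins with zero expectation are skipped."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    mask = expected > 0
    return float(np.sum((observed[mask] - expected[mask]) ** 2 / expected[mask]))


def sample_subset(labels, target_mean, target_std, n_samples, n_bins=20, seed=0):
    """Draw indices whose labels roughly follow N(target_mean, target_std).

    Assumption: the target distribution is Gaussian and is approximated
    on a fixed-width histogram over the range of the global labels.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=float)

    # Assign every label to a histogram bin.
    edges = np.linspace(labels.min(), labels.max(), n_bins + 1)
    bin_idx = np.clip(np.digitize(labels, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Target bin probabilities from an unnormalized Gaussian density;
    # bins without any available data cannot contribute.
    density = np.exp(-0.5 * ((centers - target_mean) / target_std) ** 2)
    available = np.bincount(bin_idx, minlength=n_bins)
    density[available == 0] = 0.0
    probs = density / density.sum()

    # Per-sample weight: target mass of the bin spread over its members.
    weights = probs[bin_idx] / available[bin_idx]
    weights /= weights.sum()
    chosen = rng.choice(labels.size, size=n_samples, replace=False, p=weights)

    # Compare the drawn histogram against the expected one.
    observed = np.bincount(bin_idx[chosen], minlength=n_bins)
    expected = probs * n_samples
    return chosen, chi_squared(observed, expected), gini_impurity(observed)
```

Calling `sample_subset(labels, 65.0, 8.0, 500)` on a pool of body weights would return the selected indices together with the two diagnostics. Sampling without replacement can deplete sparsely populated bins, so the drawn subset only approximates the target; a numerical optimization step such as the one the abstract describes would be needed to tighten the fit across several groups at once.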
Using a real-world application scenario, we sample train and test groups according to region-specific distributions for 3D camera-based weight and height estimation in a clinical context, and compare our proposed sampling technique against a hard data split that serves as a baseline. For comparison, we also train a baseline model on all data, and we use Federated Averaging to combine training across our data subsets, demonstrating a realistic deterioration of 25.3% in weight and 28.7% in height estimation by the global model.
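For readers unfamiliar with Federated Averaging, its aggregation step can be sketched as a sample-size-weighted mean of the client parameters. The sketch below assumes each client's model is given as a dictionary of NumPy arrays with identical keys and shapes; the function name `federated_average` and this parameter layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def federated_average(client_params, client_sizes):
    """Weighted mean of client parameters, weighted by local sample count.

    client_params: list of dicts mapping parameter name -> np.ndarray
    client_sizes:  number of training samples held by each client
    """
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    keys = client_params[0].keys()
    return {
        k: sum(c * p[k] for c, p in zip(coeffs, client_params))
        for k in keys
    }
```

With two clients holding 1 and 3 samples, a parameter of 1.0 on the first and 3.0 on the second averages to 2.5. Clients with more data pull the global model toward their own label distribution, which is one reason differently distributed client labels can matter for the global result.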
Realistic client-biased label distributions can notably harm training in a federated context. Our sampling algorithm for simulating realistic data distributions opens up an efficient way to analyze this effect in advance. The technique is agnostic to the chosen network architecture and target scenario and can be adapted to any feature or label problem with non-IID subpopulations.