Kim Soopil, Park Heejung, Chikontwe Philip, Kang Myeongkyun, Hwan Jin Kyong, Adeli Ehsan, Pohl Kilian M, Park Sang Hyun
IEEE Trans Med Imaging. 2025 May;44(5):2079-2092. doi: 10.1109/TMI.2025.3525581. Epub 2025 May 2.
Federated learning (FL) methods for multi-organ segmentation in CT scans are gaining popularity but generally require numerous rounds of parameter exchange between a central server and clients. This repetitive sharing may be impractical given the varying network infrastructures of clients and the large volume of transmitted data. Data heterogeneity among clients further increases the number of rounds required, as clients may differ in the type of data they share. For example, a client might provide label maps for only some organs (i.e., partial labels) because segmenting all organs visible in the CT is not part of its clinical protocol. To this end, we propose a communication-efficient FL approach for partial labels. Specifically, the parameters of the local models are transmitted once to a central server, and the global model is trained via knowledge distillation (KD) from the local models. While unlabeled public data can be used as inputs for KD, model accuracy is often limited by distribution shifts between local and public datasets. We therefore propose to generate synthetic images from the clients' models as additional KD inputs to mitigate the shift between public and local data. In addition, our method offers the flexibility of further finetuning over several rounds of communication using existing FL algorithms, leading to enhanced performance. Extensive evaluation on public datasets in few-communication FL scenarios reveals that our approach substantially improves over state-of-the-art methods.
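The abstract's core aggregation step, distilling a global model from once-transmitted local models under partial labels, can be illustrated by how the KD targets are formed: each client's prediction is trusted only on the organ channels it actually has labels for, and the temperature-softened predictions are averaged per channel. The sketch below is a minimal, hypothetical NumPy illustration of that target construction (function names, the masking scheme, and the temperature value are assumptions, not the paper's exact formulation), applied to per-voxel logits flattened to an (N, C) array.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kd_targets(client_logits, client_masks, T=2.0):
    """Build soft distillation targets from partially labeled clients.

    client_logits: list of (N, C) arrays, one per client, where N is the
                   number of voxels (or samples) and C the organ channels.
    client_masks:  list of length-C 0/1 arrays; 1 where that client has
                   labels for the organ (the partial-label setting).
    T:             softmax temperature for KD (hypothetical default).

    Returns an (N, C) array of soft targets: per-channel averages over the
    clients that cover that channel, renormalized to a distribution.
    """
    num = 0.0
    den = 0.0
    for logits, mask in zip(client_logits, client_masks):
        p = softmax(logits / T)
        num = num + p * mask          # only trust channels this client labels
        den = den + mask              # count of clients covering each channel
    den = np.maximum(den, 1e-8)       # guard channels no client covers
    t = num / den
    return t / t.sum(axis=-1, keepdims=True)  # renormalize over channels
```

A global model would then be trained to match these targets (e.g. with a cross-entropy or KL objective) on public and synthetic inputs, without further parameter exchange.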