Zheng Zhou, Hayashi Yuichiro, Oda Masahiro, Kitasaka Takayuki, Misawa Kazunari, Mori Kensaku
Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, Japan.
Information Strategy Office, Information and Communications, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, Japan.
Int J Comput Assist Radiol Surg. 2024 May 8. doi: 10.1007/s11548-024-03139-6.
This paper considers a new problem setting for multi-organ segmentation based on the following observations. In practice, (1) collecting a large-scale dataset from multiple institutes is usually impeded by privacy concerns; (2) many images remain unlabeled because slice-by-slice annotation is costly; and (3) datasets from different institutes may carry inconsistent, partial annotations. Learning a federated model from such distributed, partially labeled, and unlabeled samples is an unexplored problem.
To simulate this multi-organ segmentation problem, several distributed clients and a central server are maintained. The central server coordinates with the clients to learn a global model from distributed private datasets, which comprise a small portion of partially labeled images and a large portion of unlabeled images. To address this problem, a practical framework is proposed that unifies the partially supervised learning (PSL), semi-supervised learning (SSL), and federated learning (FL) paradigms through dedicated PSL, SSL, and FL modules. The PSL module learns from partially labeled samples, the SSL module extracts useful information from unlabeled data, and the FL module aggregates local information from the distributed clients into a global statistical model. Working together, the three modules allow the scheme to exploit these distributed, imperfect datasets to train a generalizable model.
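The abstract does not state which aggregation rule the FL module uses. As a minimal sketch of the general idea, assuming a FedAvg-style weighted average (an assumption, not necessarily the authors' method), the server could combine client parameters in proportion to each client's local sample count. The names `federated_average`, `client_params`, and `client_sizes` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical representation: each model is a dict mapping a parameter
# name to a flat list of floats.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params],
                      client_sizes: List[int]) -> Params:
    """Return the sample-size-weighted average of client parameters.

    This is a FedAvg-style rule used here as an assumption; the abstract
    does not specify the aggregation actually used by the FL module.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_params: Params = {}
    for name in client_params[0]:
        averaged = [0.0] * len(client_params[0][name])
        for params, size in zip(client_params, client_sizes):
            weight = size / total  # client's share of all samples
            for i, value in enumerate(params[name]):
                averaged[i] += weight * value
        global_params[name] = averaged
    return global_params


# Two toy clients; the second holds three times as much data.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_model = federated_average(clients, client_sizes=[1, 3])
```

In a full training loop, the server would send `global_model` back to the clients, each client would continue training locally with its PSL and SSL objectives, and the cycle would repeat for several rounds.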
The proposed method was extensively evaluated on multiple abdominal CT datasets, achieving an average Dice score of 84.83% and an average 95% Hausdorff distance (95HD) of 41.62 mm for multi-organ (liver, spleen, and stomach) segmentation. Its efficacy in transfer learning further demonstrated good generalization to downstream segmentation tasks.
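For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of how it is commonly computed for one organ from two binary masks. The function name `dice_score` and the flattened-list mask representation are illustrative assumptions, as is the convention of returning 1.0 when both masks are empty; the paper's evaluation code may differ.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks.

    Masks are given as flattened sequences of 0/1 voxel labels.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# One voxel overlaps; each mask marks two voxels as foreground.
score = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```

A multi-organ average such as the one reported would then be the mean of the per-organ scores, typically expressed as a percentage.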
This study considers a novel multi-organ segmentation problem: developing a generalizable model from distributed, partially labeled, and unlabeled CT images. A practical framework is presented and, through extensive validation, shown to be an effective solution with strong potential for addressing this challenging problem.