
Deep learning of the sectional appearances of 3D CT images for anatomical structure segmentation based on an FCN voting method.

Affiliations

Department of Intelligent Image Information, Graduate School of Medicine, Gifu University, Gifu, 501-1194, Japan.

Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, 29208, USA.

Publication Information

Med Phys. 2017 Oct;44(10):5221-5233. doi: 10.1002/mp.12480. Epub 2017 Aug 31.

Abstract

PURPOSE

We propose a single network trained by pixel-to-label deep learning to address the general issue of automatic multiple organ segmentation in three-dimensional (3D) computed tomography (CT) images. Our method can be described as a voxel-wise multiple-class classification scheme for automatically assigning labels to each pixel/voxel in a 2D/3D CT image.

METHODS

We reduce the segmentation of anatomical structures (including multiple organs) in a CT image (generally 3D) to a majority voting scheme over the semantic segmentations of multiple redundant 2D slices drawn from different viewpoints. The proposed method inherits the spirit of fully convolutional networks (FCNs), which consist of "convolution" and "deconvolution" layers for 2D semantic image segmentation, and extends this core structure with 3D-2D-3D transformations to adapt it to 3D CT image segmentation. All parameters in the proposed network are trained pixel-to-label from a small number of CT cases with human annotations as the ground truth. The proposed network naturally fulfills the requirements of multiple organ segmentation in CT cases of different sizes covering arbitrary scan regions, without any adjustment.
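The fusion step described above can be sketched as a per-voxel majority vote over label volumes produced from different sectional viewpoints. This is a minimal illustration, not the authors' implementation; the function name and the assumption that each 2D FCN output has already been resampled back into the common 3D grid are mine.

```python
import numpy as np

def majority_vote(label_maps):
    """Fuse per-voxel label predictions from multiple sectional passes
    (e.g., FCN outputs on axial, coronal, and sagittal slices, each
    resampled back to the same 3D volume) by per-voxel majority vote.

    label_maps: list of integer label volumes of identical shape.
    Returns a single fused label volume.
    """
    stacked = np.stack(label_maps, axis=0)        # (n_views, D, H, W)
    n_labels = int(stacked.max()) + 1
    # Count votes for each candidate label, then pick the winner per voxel.
    votes = np.zeros((n_labels,) + stacked.shape[1:], dtype=np.int32)
    for lbl in range(n_labels):
        votes[lbl] = (stacked == lbl).sum(axis=0)
    return votes.argmax(axis=0)
```

With three views, a voxel keeps a label whenever at least two of the sectional segmentations agree on it, which is what gives the scheme its redundancy against per-slice errors.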

RESULTS

The proposed network was trained and validated on the simultaneous segmentation of 19 anatomical structures in the human torso, including 17 major organs and two special regions (the lumen and the contents of the stomach). Some of these structures have never been reported in previous research on CT segmentation. A database of 240 3D CT scans (95% for training and 5% for testing), together with their manually annotated ground-truth segmentations, was used in our experiments. The results show that the 19 structures of interest were segmented with acceptable accuracy (88.1% and 87.9% of voxels in the training and testing datasets, respectively, were labeled correctly) against the ground truth.
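The accuracy figure quoted above is the fraction of voxels whose predicted label matches the ground-truth annotation. A minimal sketch of that metric (my own helper, not code from the paper):

```python
import numpy as np

def voxel_accuracy(pred, gt):
    """Fraction of voxels whose predicted label equals the ground-truth
    label -- the overall labeling accuracy reported in the abstract
    (88.1% training, 87.9% testing)."""
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    if pred.shape != gt.shape:
        raise ValueError("prediction and ground truth must align voxel-wise")
    return float((pred == gt).mean())
```

Note that a single global accuracy can mask per-organ differences, since large structures dominate the voxel count; per-structure overlap scores are commonly reported alongside it.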

CONCLUSIONS

We propose a single network based on pixel-to-label deep learning to address the challenging issue of anatomical structure segmentation in 3D CT cases. The novelty of this work lies in two points: deep learning of the different 2D sectional appearances of 3D anatomical structures in CT cases, and majority voting over the 3D segmentation results obtained from multiple crossed 2D sections. Together these achieve availability and reliability with better efficiency, generality, and flexibility than conventional segmentation methods, which must be guided by human expertise.
