Univ Lyon, Université Claude Bernard Lyon 1, INSA-Lyon, CNRS, Inserm, CREATIS UMR 5220, U1294, F-69621, Lyon, France.
Service de Réanimation Médicale, Hôpital de la Croix Rousse, Hospices Civils de Lyon, France.
Med Phys. 2022 Jan;49(1):420-431. doi: 10.1002/mp.15347. Epub 2021 Dec 2.
Motion-mask segmentation from thoracic computed tomography (CT) images is the process of extracting the region that encompasses the lungs and viscera, where large displacements occur during breathing. It has been shown to help image registration between different respiratory phases. This registration step is useful, for example, for radiotherapy planning or for calculating local lung ventilation. Knowing the location of motion discontinuities, that is, sliding motion near the pleura, allows better control of the registration, preventing unrealistic estimates. Nevertheless, existing methods for motion-mask segmentation are not robust enough to be used in clinical routine. This article shows that it is feasible to overcome this lack of robustness with a lightweight deep-learning approach usable on a standard computer, even without data augmentation or advanced model design.
A convolutional neural-network architecture with three 2D U-nets, one for each of the three main orientations (sagittal, coronal, axial), was proposed. Predictions generated by the three U-nets were combined by majority voting to provide a single 3D segmentation of the motion mask. The networks were trained on a database of 4D CT images of 43 patients with non-small cell lung cancer. Training and evaluation were done with a K-fold cross-validation strategy. Evaluation was based on visual grading by two experts of the appropriateness of the segmented motion mask for the registration task, and on a comparison with motion masks obtained by a baseline method using level sets. A second database (76 CT images of patients with early-stage COVID-19), unseen during training, was used to assess the generalizability of the trained neural network.
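The per-voxel majority vote that fuses the three orientation-specific predictions into one 3D mask can be sketched as follows. This is a minimal NumPy illustration under the assumption that each U-net's output has already been resampled to a common 3D grid as a binary mask; the function name and variables are illustrative, not the authors' published code:

```python
import numpy as np

def majority_vote(pred_sagittal, pred_coronal, pred_axial):
    """Fuse three binary 3D masks (one per 2D U-net orientation) by
    per-voxel majority vote: a voxel belongs to the motion mask if at
    least two of the three networks predict it as foreground."""
    stacked = np.stack([pred_sagittal, pred_coronal, pred_axial], axis=0)
    return (stacked.sum(axis=0) >= 2).astype(np.uint8)

# Illustrative call on toy 4x4x4 binary volumes: where two of the
# three masks agree on foreground, the fused mask is foreground.
sag = np.ones((4, 4, 4), dtype=np.uint8)
cor = np.ones((4, 4, 4), dtype=np.uint8)
axi = np.zeros((4, 4, 4), dtype=np.uint8)
fused = majority_vote(sag, cor, axi)  # all ones: 2 of 3 agree
```

A per-voxel vote like this keeps the fusion step cheap relative to the network forward passes, which is consistent with the low memory footprint and short inference times reported below.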
The proposed approach outperformed the baseline method in both quality and robustness: the success rate increased from to, without producing any failure. It also achieved a speed-up factor of 60 on GPU, or 17 on CPU. The memory footprint was low: less than 5 GB of GPU RAM for training and less than 1 GB for inference. When evaluated on a dataset whose images differed in several characteristics (CT device, pathology, and field of view), the proposed method improved the success rate from to.
With a 5-s processing time on a mid-range GPU and success rates around, the proposed approach appears fast and robust enough to be used routinely in clinical practice. The success rate can be further improved by incorporating more diversity into the training data via data augmentation and additional annotated images from different scanners and diseases. The code and trained model are publicly available.