Leveraging global binary masks for structure segmentation in medical images.
Author affiliations
Department of Radiation Oncology, the University of Texas Southwestern Medical Center, Dallas TX 75390 United States of America.
Department of Radiation Oncology, Stanford University, Stanford, CA 94305, United States of America.
Publication information
Phys Med Biol. 2023 Sep 13;68(18). doi: 10.1088/1361-6560/acf2e2.
Deep learning (DL) models for medical image segmentation are strongly affected by intensity variations in the input images and generalize poorly because they rely primarily on pixel intensity information for inference. Acquiring sufficient training data is another challenge that limits the models' applications. Here, we proposed to leverage the consistency of organs' anatomical position and shape in medical images. We introduced a framework that exploits recurring anatomical patterns through global binary masks for organ segmentation. Two scenarios were studied: (1) global binary masks were the only input to the U-Net-based model, forcing it to encode exclusively organs' position and shape information for rough segmentation or localization; (2) global binary masks were incorporated as an additional input channel, providing position and shape clues to mitigate training data scarcity. Two datasets of brain and heart computed tomography (CT) images with their ground truth were split into (26:10:10) and (12:3:5) for training, validation, and testing, respectively. Both scenarios were evaluated using the full training split as well as reduced subsets of the training data. In scenario (1), training exclusively on global binary masks led to Dice scores of 0.77 ± 0.06 and 0.85 ± 0.04 for the brain and heart structures, respectively. Average Euclidean distances of 3.12 ± 1.43 mm and 2.5 ± 0.93 mm relative to the center of mass of the ground truth were obtained for the brain and heart structures, respectively. These outcomes indicate that global binary masks encode a surprising degree of position and shape information. In scenario (2), incorporating global binary masks led to significantly higher accuracy than the model trained on CT images alone for small subsets of training data; performance improved by 4.3%-125.3% and 1.3%-48.1% for 1-8 training cases of the brain and heart datasets, respectively. The findings imply that global binary masks are advantageous for building models that are robust to image intensity variations, and that they offer an effective way to boost performance when access to labeled training data is highly limited.
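The two evaluation metrics reported above, the Dice score and the Euclidean distance between centers of mass, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the set-of-voxel-indices representation, the function names (`dice_score`, `center_of_mass`, `com_distance_mm`, `cube`), the voxel spacing, and the toy masks are all assumptions chosen to keep the example dependency-free.

```python
import math


def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of voxel indices."""
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def center_of_mass(mask, spacing_mm):
    """Mean voxel coordinate of a non-empty mask, scaled to millimetres."""
    n = len(mask)
    return tuple(
        spacing * sum(voxel[axis] for voxel in mask) / n
        for axis, spacing in enumerate(spacing_mm)
    )


def com_distance_mm(pred, truth, spacing_mm):
    """Euclidean distance (mm) between the centers of mass of two masks."""
    return math.dist(
        center_of_mass(pred, spacing_mm), center_of_mass(truth, spacing_mm)
    )


def cube(start, size):
    """Hypothetical helper: a solid cube of voxels as a set of (z, y, x) indices."""
    z0, y0, x0 = start
    return {
        (z, y, x)
        for z in range(z0, z0 + size)
        for y in range(y0, y0 + size)
        for x in range(x0, x0 + size)
    }


# Toy data (not from the study): a 4x4x4 "ground truth" cube and a
# prediction shifted by one voxel along x.
truth = cube((0, 0, 0), 4)
pred = cube((0, 0, 1), 4)
spacing = (1.0, 1.0, 2.0)  # assumed (z, y, x) voxel spacing in mm

print(f"Dice: {dice_score(pred, truth):.2f}")
print(f"Center-of-mass distance: {com_distance_mm(pred, truth, spacing):.2f} mm")
```

In this toy case the masks overlap in 48 of 64 voxels each, so the Dice score is 0.75, and the one-voxel shift along an axis with 2 mm spacing gives a center-of-mass distance of 2 mm. In practice the masks would usually be held as arrays and handled with an array library; sets are used here only to keep the sketch self-contained.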