Cheng Kaiming, Shen Yueyang, Dinov Ivo D
Statistics Online Computational Resource, University of Michigan, 426 North Ingalls Str, Ann Arbor, Michigan 48109-2003.
J Stat Theory Pract. 2024 Sep;18(3). doi: 10.1007/s42519-024-00384-5. Epub 2024 Jun 17.
In this paper, we propose a novel deep neural network (DNN) architecture with fractal structure and attention blocks. The new method is tested on identifying and segmenting 2D and 3D brain tumor masks in normal and pathological neuroimaging data. To circumvent the problem of limited 3D volumetric datasets with raw and ground truth tumor masks, we used data augmentation with affine transformations to significantly expand the training data prior to estimating the network model parameters. The proposed technique combines the benefits of fractal convolutional networks, attention blocks, and the encoder-decoder structure of Unet. The AFUnet models are fit on training data and their performance is assessed on independent validation and testing datasets. The Dice score is used to measure and contrast the performance of AFUnet against alternative methods, such as Unet, attention Unet, and several other DNN models with comparable numbers of parameters. In addition, we explore the effect of network depth on the AFUnet prediction accuracy. The results suggest that with a few network structure iterations, the attention-based fractal Unet achieves good performance. Although a deeper nested network structure certainly improves the prediction accuracy, this comes at a very substantial computational cost. The benefits of fitting deeper AFUnet models must be weighed against the extra time and computational demands. Some of the AFUnet networks outperform current state-of-the-art models and achieve highly accurate and realistic brain-tumor boundary segmentation (contours in 2D and surfaces in 3D). In our experiments, the Dice score is only marginally sensitive to significant inter-model differences. However, the validation loss improves over long periods of AFUnet training. The lower binary cross entropy loss suggests that AFUnet is superior in finding true negative voxels (i.e., identifying normal tissue), which suggests the new method is more conservative.
This approach may be generalized to higher dimensional data, e.g., 4D fMRI hypervolumes, and applied for a wide range of signal, image, volume, and hypervolume segmentation tasks.
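The abstract's observation that the Dice score can fail to separate models while the binary cross entropy (BCE) loss does can be illustrated with a minimal sketch. It assumes flattened binary masks and per-voxel tumor probabilities; the function names, the 0.5 threshold, and the toy numbers are hypothetical choices for illustration, not the authors' implementation or data.

```python
import math

def dice_score(pred, truth, eps=1e-7):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat 0/1 masks."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    return (2.0 * intersection + eps) / (sum(pred) + sum(truth) + eps)

def binary_cross_entropy(prob, truth, eps=1e-7):
    """Mean BCE over voxels; prob holds predicted tumor probabilities."""
    total = 0.0
    for p, t in zip(prob, truth):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(truth)

def threshold(prob, cutoff=0.5):
    """Turn probabilities into a binary mask (cutoff is an assumption)."""
    return [1 if p >= cutoff else 0 for p in prob]

# Toy volume (made-up values): 10 tumor voxels, 90 normal-tissue voxels.
truth = [1] * 10 + [0] * 90

# Hypothetical model A is confident that normal tissue is normal;
# hypothetical model B is much less certain about the same voxels.
prob_a = [0.9] * 10 + [0.01] * 90
prob_b = [0.9] * 10 + [0.40] * 90

dice_a = dice_score(threshold(prob_a), truth)
dice_b = dice_score(threshold(prob_b), truth)
bce_a = binary_cross_entropy(prob_a, truth)
bce_b = binary_cross_entropy(prob_b, truth)

print(f"Dice A = {dice_a:.3f}, Dice B = {dice_b:.3f}")
print(f"BCE  A = {bce_a:.3f}, BCE  B = {bce_b:.3f}")
```

In this toy case both models yield the same thresholded mask, so their Dice scores coincide, while BCE rewards the model that assigns lower tumor probability to true negative voxels. Dice has no true-negative term, which is one plausible reason it can be insensitive to differences that BCE detects.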