Faculty of Information and Communication Technology, Mahidol University, Nakhon Pathom, 73170, Thailand.
Division of Plastic and Reconstructive Surgery, Department of Surgery, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, 10700, Thailand.
Sci Rep. 2024 Aug 24;14(1):19671. doi: 10.1038/s41598-024-70826-4.
The automatic segmentation of the pharyngeal airway space has many potential medical uses, one of which is to facilitate the creation of the Tubingen Palatal Plate. It is therefore important to understand which methods are suitable for this task. Here, neural network-based solutions available in the literature are compared to find the best-performing methods. The models were chosen to cover a diverse landscape. Some were taken from the general semantic segmentation literature, while others were taken from the medical or pharyngeal airway space segmentation literature. Some are convolutional neural networks, while others are transformer-based models or hybrids of convolutional and transformer-based models. The models include 2D/3D U-Net, DeepLabv3, YOLOv8, Swinv2 UNETR, SegFormer, and 3D MRU-Net. Additional strategies to enhance performance were also considered. These consisted of training two separate networks in multiple stages, as well as leveraging unlabeled data to pretrain the neural networks before fine-tuning them on the labeled data. Of all the models considered, the 2D U-Net performed best, achieving an average Dice score of 0.9180 ± 0.0111. Of all the performance-enhancing strategies, only two improved the results, and only by a small margin. These strategies can therefore be considered if a small increase in performance over the 2D U-Net is desired at the expense of computational resources.
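For readers unfamiliar with the reported metric, the Dice score between a predicted mask and a ground-truth mask is twice the size of their overlap divided by the sum of their sizes. The sketch below is illustrative only and is not the authors' evaluation code: the function names `dice_score` and `summarize` are hypothetical, masks are assumed to be flat sequences of 0/1 values, and the "mean ± standard deviation" summary assumes per-case scores and a sample standard deviation, neither of which the abstract specifies.

```python
from statistics import mean, stdev


def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Number of positions labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground count across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def summarize(scores):
    """Return (mean, sample standard deviation) of a list of Dice scores.

    Assumption: the abstract does not say how its spread was computed.
    """
    return mean(scores), stdev(scores)
```

With these definitions, `dice_score([1, 1, 0, 0], [1, 0, 0, 0])` has an overlap of 1 and a combined foreground count of 3, giving 2/3.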