Faculty of Informatics, Kaunas University of Technology, 44249 Kaunas, Lithuania.
Sensors (Basel). 2022 Aug 24;22(17):6354. doi: 10.3390/s22176354.
Binary object segmentation is a sub-area of semantic segmentation that can be used for a variety of applications. Semantic segmentation models can be applied to binary segmentation problems by defining only two classes, but such models are more complex than the task requires. This leads to very long training times, since this category of convolutional neural networks (CNNs) usually has tens of millions of parameters to learn. This article introduces a novel abridged VGG-16 and SegNet-inspired reflected architecture adapted for binary segmentation tasks. The architecture has 27 times fewer parameters than SegNet, yet it yields 86% segmentation cross-intersection accuracy and 93% binary accuracy. The proposed architecture is evaluated on a large dataset of depth images collected using the Kinect device, achieving an accuracy of 99.25% in human body shape segmentation and 87% in gender recognition tasks.
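The two segmentation scores quoted above can be illustrated with a minimal sketch. The abstract does not define "cross-intersection accuracy", so the code assumes it is close to the common intersection-over-union (IoU) measure; that reading, the function names, and the toy masks are illustrative assumptions and are not taken from the article.

```python
from typing import Sequence

# A 2D binary mask: 1 marks an object pixel, 0 marks background.
Mask = Sequence[Sequence[int]]


def binary_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label equals the reference label."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total if total else 0.0


def intersection_over_union(pred: Mask, truth: Mask) -> float:
    """Foreground overlap divided by foreground union (assumed stand-in
    for the abstract's "cross-intersection accuracy")."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Convention chosen here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


if __name__ == "__main__":
    # Hypothetical 3x4 masks, for illustration only.
    pred = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
    truth = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
    print(binary_accuracy(pred, truth))
    print(intersection_over_union(pred, truth))
```

On these toy masks the binary accuracy is 10/12 and the IoU is 3/5. Binary accuracy also rewards correctly labeled background pixels, which is one plausible reason an accuracy figure can sit above an overlap-based figure for the same model.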