Li Ping, Han Taiyu, Ren Yifei, Xu Peng, Yu Hongliu
Institute of Rehabilitation Engineering and Technology, University of Shanghai for Science and Technology, Shanghai, China.
Department of Biomedical Engineering, Changzhi Medical College, Changzhi, Shanxi, China.
PeerJ Comput Sci. 2023 Mar 10;9:e1288. doi: 10.7717/peerj-cs.1288. eCollection 2023.
An automatic bathing robot needs to identify the area to be bathed in order to perform visually guided bathing tasks, and skin detection is the first step. Deep convolutional neural network (CNN)-based object detection algorithms show excellent robustness to changes in lighting and environment when performing skin detection. One-stage object detection algorithms have good real-time performance and are widely used in practical projects.
In our previous work, we performed skin detection using Faster R-CNN (ResNet50 as backbone), Faster R-CNN (MobileNetV2 as backbone), YOLOv3 (DarkNet53 as backbone), YOLOv4 (CSPDarknet53 as backbone), and CenterNet (Hourglass as backbone), and found that YOLOv4 performed best. In this study, for ease of practical deployment, we used the lightweight version of YOLOv4, i.e., YOLOv4-tiny, for skin detection. We also added three kinds of attention mechanisms to strengthen feature extraction: SE, ECA, and CBAM. The attention module was added to the two feature layers output by the backbone and, in the enhanced feature extraction network, applied to the up-sampled features. For a full comparison, we also evaluated other lightweight methods that use MobileNetV1, MobileNetV2, and MobileNetV3 as the backbone of YOLOv4. To evaluate the models, we established a comprehensive evaluation index that mainly reflects the balance between model size and mAP.
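As an illustration of the simplest of the three attention mechanisms, the following is a minimal pure-Python sketch of an SE (squeeze-and-excitation) channel-attention step. It is not the authors' implementation: the function name `squeeze_excite`, the weight matrices `w1` and `w2`, the omission of biases, and the nested-list tensor layout are all assumptions made for readability. A real model would use a deep learning framework and learned weights.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Apply SE-style channel attention (illustrative sketch, biases omitted).

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: hidden x C weight matrix (channel reduction), hypothetical values.
    w2: C x hidden weight matrix (channel restoration), hypothetical values.
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: global average pooling collapses each channel to one number.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, part 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, part 2: fully connected layer followed by sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: two 2x2 channels and arbitrary (untrained) weights.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
scaled, gates = squeeze_excite(features, w1=[[0.5, 0.5]], w2=[[1.0], [-1.0]])
print(gates)
```

In this sketch the gates reweight whole channels and leave the spatial layout untouched. ECA and CBAM follow the same reweighting idea with different ways of computing the gates, and CBAM adds a spatial attention step.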
The experimental results showed that the weight file of YOLOv4-tiny without attention mechanisms was reduced to 9.2% of the size of YOLOv4's, while its mAP remained at 67.3% of YOLOv4's. Integrating the CBAM or ECA module improved YOLOv4-tiny's performance, whereas adding SE degraded it. MobileNetVX_YOLOv4 (X = 1, 2, 3), which used MobileNetV1, MobileNetV2, and MobileNetV3 as the backbone of YOLOv4, showed higher mAP than the YOLOv4-tiny series (YOLOv4-tiny and its three attention-based variants) but had larger weight files. When network performance was evaluated using the comprehensive evaluation index, the model integrating the CBAM attention mechanism with YOLOv4-tiny achieved a good balance between model size and detection accuracy.
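The abstract does not give the formula of the comprehensive evaluation index, so the sketch below does not reproduce it. It only works through the arithmetic on the two reported figures for plain YOLOv4-tiny, using a hypothetical stand-in quantity (`retention_ratio`, the fraction of mAP retained divided by the fraction of weight-file size retained) to show how a size-versus-accuracy trade-off can be condensed into one number.

```python
def retention_ratio(map_fraction, size_fraction):
    """Fraction of mAP retained per fraction of weight-file size retained.

    Hypothetical illustration only; this is NOT the paper's
    comprehensive evaluation index, whose formula is not given here.
    """
    if size_fraction <= 0:
        raise ValueError("size_fraction must be positive")
    return map_fraction / size_fraction


# Reported figures for YOLOv4-tiny relative to YOLOv4:
# weight file at 9.2% of the size, mAP at 67.3% of the value.
tiny_ratio = retention_ratio(map_fraction=0.673, size_fraction=0.092)

# By construction, the full YOLOv4 baseline scores exactly 1.
baseline_ratio = retention_ratio(map_fraction=1.0, size_fraction=1.0)

print(round(tiny_ratio, 2), baseline_ratio)
```

Under this stand-in measure, a value above 1 means the model gives up proportionally less accuracy than size, which is the kind of balance the abstract describes.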