Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China.
Sensors (Basel). 2021 Jan 29;21(3):916. doi: 10.3390/s21030916.
In recent years, human detection in indoor scenes has been widely applied in smart buildings and smart security, but several challenges remain difficult to address, such as frequent occlusion, low illumination, and multiple poses. This paper proposes an asymmetric adaptive fusion two-stream network (AAFTS-net) for RGB-D human detection. The network fully extracts person-specific depth features and RGB features while reducing the complexity typical of two-stream networks. A depth feature pyramid is constructed by combining contextual information with multiscale depth features, improving adaptability to targets of different sizes. An adaptive channel weighting (ACW) module weights the RGB-D feature channels to achieve efficient feature selection and information complementation. This paper also introduces a novel RGB-D dataset for human detection, called RGBD-human, on which we verify the performance of the proposed algorithm. The experimental results show that AAFTS-net outperforms existing state-of-the-art methods and maintains stable performance under conditions of frequent occlusion, low illumination, and multiple poses.
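The abstract does not specify how the ACW module computes its per-channel weights. A common way to realize adaptive channel weighting is squeeze-and-excitation-style gating: pool each channel to a scalar, pass the result through a small bottleneck MLP, and rescale the channels by the resulting weights. The sketch below (pure NumPy; the function name and weight shapes are hypothetical, not taken from the paper) illustrates that general mechanism on a fused RGB-D feature map:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_channel_weighting(feat, w1, w2):
    """SE-style channel gating sketch (hypothetical ACW stand-in).

    feat: (C, H, W) fused RGB-D feature map
    w1:   (C//r, C) bottleneck weights (reduction ratio r)
    w2:   (C, C//r) expansion weights
    """
    # Squeeze: global average pool over spatial dimensions -> (C,)
    s = feat.mean(axis=(1, 2))
    # Excitation: bottleneck MLP producing per-channel weights in (0, 1)
    z = np.maximum(w1 @ s, 0.0)   # ReLU
    w = sigmoid(w2 @ z)           # (C,)
    # Re-weight: scale each channel by its learned importance
    return feat * w[:, None, None]

# Toy example: 8 channels (imagine 4 RGB-derived + 4 depth-derived), 5x5 maps
rng = np.random.default_rng(0)
C, r = 8, 2
feat = rng.standard_normal((C, 5, 5))
w1 = 0.1 * rng.standard_normal((C // r, C))
w2 = 0.1 * rng.standard_normal((C, C // r))
out = adaptive_channel_weighting(feat, w1, w2)
print(out.shape)
```

In a trained network the weights `w1` and `w2` would be learned end to end, letting the gate emphasize whichever modality (RGB or depth) is more informative for each channel, e.g. favoring depth channels under low illumination.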