College of Information Technology, Gachon University, Seongnam 13120, Korea.
Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute, Daejeon 34129, Korea.
Sensors (Basel). 2022 Sep 1;22(17):6626. doi: 10.3390/s22176626.
Pedestrians are often obstructed by other objects or people in real-world vision sensors. These obstructions make pedestrian-attribute recognition (PAR) difficult; hence, occlusion handling for visual sensing is a key issue in PAR. To address this problem, we first formulate the identification of non-occluded frames as temporal attention based on the sparsity of a crowded video. In other words, the PAR model is guided to avoid attending to occluded frames. However, this approach by itself cannot capture the correlation between attributes when occlusion occurs. For example, "boots" and "shoe color" cannot be recognized simultaneously when the foot is invisible. To address this uncorrelated-attention issue, we propose a novel temporal-attention module based on group sparsity. Group sparsity is applied across the attention weights of correlated attributes. Accordingly, physically adjacent pedestrian attributes are grouped, and the attention weights of a group are forced to focus on the same frames. Experimental results indicate that the proposed method achieved 1.18% and 6.21% higher F1-scores than the advanced baseline method on the occlusion samples of the DukeMTMC-VideoReID and MARS video-based PAR datasets, respectively.
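The following is a minimal sketch, not the authors' implementation, of how a group-sparsity (L2,1-style) penalty over temporal attention weights could be written, assuming PyTorch; the function name, tensor shapes, and attribute grouping are hypothetical choices for illustration. Attention weights are arranged as (attributes x frames), attributes that share a body region form a group, and the penalty pushes each group to concentrate its attention on the same non-occluded frames.

```python
import torch

def group_sparsity_penalty(attn: torch.Tensor, groups: list[list[int]]) -> torch.Tensor:
    """Hypothetical L2,1-style group-sparsity penalty on temporal attention.

    attn:   (A, T) attention weights over T frames for A attributes.
    groups: lists of attribute indices that are physically adjacent
            (e.g., "boots" and "shoe color" both depend on the foot region).
    """
    penalty = attn.new_zeros(())
    for g in groups:
        # Attention rows of one attribute group: shape (|g|, T).
        w = attn[g]
        # L2 norm over attributes within the group, then L1 sum over frames:
        # frames that carry little attention are suppressed for the whole
        # group at once, so grouped attributes end up attending to the
        # same (non-occluded) frames.
        penalty = penalty + w.norm(p=2, dim=0).sum()
    return penalty

# Usage sketch: softmax-normalized temporal attention for 6 attributes, 8 frames,
# with three hypothetical body-region groups.
attn = torch.softmax(torch.randn(6, 8), dim=1)
groups = [[0, 1], [2, 3, 4], [5]]
loss_reg = group_sparsity_penalty(attn, groups)
```

In this sketch the penalty would typically be added to the PAR classification loss with a weighting coefficient; the grouping itself (which attributes share a region) is assumed to be defined by hand, following the paper's notion of physically adjacent attributes.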