IEEE Trans Pattern Anal Mach Intell. 2016 Jan;38(1):88-101. doi: 10.1109/TPAMI.2015.2420563.
Human parsing, the partitioning of the human body into semantic regions, has recently drawn much attention because of its wide applications in human-centric analysis. Previous works often treat human pose estimation as a prerequisite of human parsing. We argue that these approaches cannot obtain optimal pixel-level parsing because the two tasks have inconsistent targets. In this work, we address human parsing directly, using the novel Parselet representation as the building block of our parsing model. Parselets are a group of parsable segments that can generally be obtained by low-level over-segmentation algorithms and that bear strong semantic meaning. We then build a deformable mixture parsing model (DMPM) for human parsing that simultaneously handles the deformation and the multi-modalities of Parselets. The proposed model has two unique characteristics: (1) the potentially numerous modalities of Parselet ensembles are represented as the "And-Or" structure of sub-trees; (2) to address the practical problem of Parselet occlusion or absence, we directly model the visibility property at some leaf nodes. The DMPM thus solves human parsing directly, by searching for the best graph configuration from a pool of Parselet hypotheses without intermediate tasks. Fast rejection based on hierarchical filtering is employed to ensure overall efficiency. Comprehensive evaluations on a benchmark dataset and on a new large-scale human parsing dataset, which was crawled from the Internet and has high resolution and thorough pixel-level semantic annotations, demonstrate the encouraging performance of the proposed approach.
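The idea of searching for the best configuration in an "And-Or" structure, with visibility modeled at some leaf nodes, can be illustrated with a toy sketch. This is not the DMPM itself: it omits the deformation terms and the hierarchical filtering, and every name in it (`Leaf`, `And`, `Or`, `best_config`, `invisible_score`, the labels and the scores) is a hypothetical choice made for illustration only. It assumes each Parselet hypothesis already carries a scalar appearance score.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    label: str                               # Parselet category, e.g. "hair"
    invisible_score: Optional[float] = None  # None means the leaf must be visible


@dataclass
class And:
    children: list                           # all children are selected together


@dataclass
class Or:
    children: list                           # exactly one child (modality) is selected


def best_config(node, pool):
    """Return (score, assignment) of the best configuration under `node`.

    `pool` maps a label to a list of (hypothesis_id, score) pairs.
    In the assignment, a label mapped to None means "modeled as invisible".
    """
    if isinstance(node, Leaf):
        candidates = [(s, {node.label: h}) for h, s in pool.get(node.label, [])]
        if node.invisible_score is not None:
            candidates.append((node.invisible_score, {node.label: None}))
        if not candidates:
            return float("-inf"), {}         # no hypothesis and not allowed to hide
        return max(candidates, key=lambda c: c[0])

    results = [best_config(child, pool) for child in node.children]
    if isinstance(node, And):
        merged = {}
        for _, assignment in results:
            merged.update(assignment)
        return sum(score for score, _ in results), merged
    return max(results, key=lambda r: r[0])  # Or node: keep the best modality


# Hypothetical pool of scored Parselet hypotheses (scores are made up).
pool = {
    "hair": [("h1", 0.9), ("h2", 0.4)],
    "dress": [("d1", 1.2)],
    "upper": [("u1", 0.5)],
    "skirt": [("s1", 0.3)],
    "bag": [],                               # no hypothesis found for the bag
}

# Two clothing modalities: a dress, or upper clothes plus a skirt.
tree = And([
    Leaf("hair"),
    Or([Leaf("dress"), And([Leaf("upper"), Leaf("skirt")])]),
    Leaf("bag", invisible_score=-0.1),       # the bag may be occluded or absent
])

score, assignment = best_config(tree, pool)
print(score, assignment)
```

In this toy run the Or node keeps the dress modality, because its score exceeds the combined score of upper clothes and skirt, and the bag leaf falls back to its invisible state because the pool offers no bag hypothesis.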