Li Wuyang, Chen Zhen, Li Baopu, Zhang Dingwen, Yuan Yixuan
IEEE Trans Image Process. 2021;30:9456-9469. doi: 10.1109/TIP.2021.3126423. Epub 2021 Nov 18.
Decoupling the sibling head has recently shown great potential in relieving the inherent task-misalignment problem in two-stage object detectors. However, existing works design similar structures for classification and regression, ignoring the task-specific characteristics and feature demands of each. In addition, the shared knowledge that may benefit both branches is neglected, which can lead to excessive decoupling and semantic inconsistency. To address these two issues, we propose a Heterogeneous Task Decoupling (HTD) framework for object detection, which uses a Progressive Graph (PGraph) module and a Border-aware Adaptation (BA) module for task decoupling. Specifically, we first devise a Semantic Feature Aggregation (SFA) module that aggregates global semantics under image-level supervision and serves as the shared knowledge for the task-decoupled framework. Then, the PGraph module performs progressive graph reasoning, comprising local spatial aggregation and global semantic interaction, to enhance the semantic representations of region proposals for classification. The proposed BA module adaptively integrates multi-level features, focusing on low-level border activation, to obtain representations with spatial and border perception for regression. Finally, we use the aggregated knowledge from SFA to maintain the instance-level semantic consistency (ISC) of the decoupled framework. Extensive experiments demonstrate that HTD outperforms existing detection methods by a large margin and achieves single-model 50.4% AP and 33.2% AP_S on the COCO test-dev set using a ResNet-101-DCN backbone, which is the best entry among state-of-the-art methods under the same configuration. Our code is available at https://github.com/CityU-AIM-Group/HTD.
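To make the idea of "global semantic interaction" among region proposals more concrete, the following is a minimal illustrative sketch and not the authors' PGraph implementation (see the linked repository for that). It assumes each proposal is a plain feature vector and uses a generic attention-style message-passing step: dot-product affinities, a softmax over neighbours, and a residual update. The function names `softmax`, `dot`, and `semantic_interaction`, as well as the toy features, are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def semantic_interaction(features):
    """One round of attention-style message passing over proposal features.

    Assumption: a fully connected graph whose edge weights are the
    softmax-normalised, scaled dot products between proposal features.
    This is a generic stand-in, not the module described in the paper.
    """
    n = len(features)
    dim = len(features[0])
    scale = math.sqrt(dim)
    updated = []
    for i in range(n):
        # Affinity of proposal i to every proposal, including itself.
        weights = softmax(
            [dot(features[i], features[j]) / scale for j in range(n)]
        )
        # Message: affinity-weighted average of all proposal features.
        message = [
            sum(weights[j] * features[j][k] for j in range(n))
            for k in range(dim)
        ]
        # Residual update keeps the original feature and adds context.
        updated.append([f + m for f, m in zip(features[i], message)])
    return updated


# Toy example: two similar proposals and one dissimilar proposal.
proposals = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
refined = semantic_interaction(proposals)
```

In this toy setup, similar proposals exchange more information than dissimilar ones because their dot-product affinity is higher. A real detector head would operate on learned, high-dimensional RoI features with trainable projections.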