Chen Chaoqi, Li Jiongcheng, Zhou Hong-Yu, Han Xiaoguang, Huang Yue, Ding Xinghao, Yu Yizhou
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3677-3694. doi: 10.1109/TPAMI.2022.3179445. Epub 2023 Feb 3.
Domain Adaptive Object Detection (DAOD) focuses on improving the generalization ability of object detectors via knowledge transfer. Recent advances in DAOD strive to shift the emphasis of the adaptation process from global to local by means of fine-grained feature alignment methods. However, both the global and local alignment approaches fail to capture the topological relations among different foreground objects, as the explicit dependencies and interactions between and within domains are neglected. In this case, merely seeking one-vs-one alignment does not necessarily ensure precise knowledge transfer. Moreover, conventional alignment-based approaches may be vulnerable to catastrophic overfitting on less transferable regions (e.g., backgrounds) due to the accumulation of inaccurate localization results in the target domain. To remedy these issues, we first formulate DAOD as an open-set domain adaptation problem, in which the foregrounds and backgrounds are treated as the "known classes" and the "unknown class," respectively. Accordingly, we propose a new and general framework for DAOD, named Foreground-aware Graph-based Relational Reasoning (FGRR), which incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations in both pixel and semantic spaces, thereby endowing the DAOD model with the capability of relational reasoning beyond the popular alignment-based paradigm. FGRR first identifies foreground pixels and regions via reliable-correspondence search and cross-domain similarity regularization, respectively. The inter-domain visual and semantic correlations are hierarchically modeled via bipartite graph structures, and the intra-domain relations are encoded via graph attention mechanisms. Through message passing, each node aggregates semantic and contextual information from both the same and the opposite domain to substantially enhance its expressive power.
Empirical results demonstrate that the proposed FGRR exceeds state-of-the-art performance on four DAOD benchmarks.
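The cross-domain message passing described above can be illustrated with a minimal sketch. This is not the authors' implementation: the bipartite affinity (scaled dot product) and the residual aggregation are simplifying assumptions, and learned projection weights, multi-head attention, and the pixel/semantic hierarchy of FGRR are omitted. It only shows the core idea that each foreground node aggregates context from nodes of the opposite domain.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bipartite_message_passing(src, tgt):
    """One round of cross-domain message passing over a bipartite graph.

    src: (Ns, d) source-domain foreground node features (assumed inputs)
    tgt: (Nt, d) target-domain foreground node features (assumed inputs)
    Returns updated features of the same shapes.
    """
    d = src.shape[1]
    # Affinity between every source/target node pair (edge weights of the
    # bipartite graph), scaled as in dot-product attention.
    affinity = src @ tgt.T / np.sqrt(d)            # (Ns, Nt)
    # Each source node aggregates target-domain context (messages tgt -> src),
    # and vice versa; a residual connection keeps the original features.
    src_new = src + softmax(affinity, axis=1) @ tgt
    tgt_new = tgt + softmax(affinity.T, axis=1) @ src
    return src_new, tgt_new
```

In this toy form, an intra-domain graph attention step would look identical with `src == tgt`; the paper's framework interleaves both intra- and inter-domain updates on pixel-level and semantic-level graphs.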