Zhi Tian, Chunhua Shen, Hao Chen, Tong He
IEEE Trans Pattern Anal Mach Intell. 2022 Apr;44(4):1922-1933. doi: 10.1109/TPAMI.2020.3032166. Epub 2022 Mar 4.
In computer vision, object detection is one of the most important tasks, underpinning several instance-level recognition tasks and many downstream applications. Recently, one-stage methods have gained much attention over two-stage approaches due to their simpler design and competitive performance. Here we propose a fully convolutional one-stage object detector (FCOS) that solves object detection in a per-pixel prediction fashion, analogous to other dense prediction problems such as semantic segmentation. Almost all state-of-the-art object detectors, such as RetinaNet, SSD, YOLOv3, and Faster R-CNN, rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor-box free as well as proposal free. By eliminating the pre-defined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes, such as calculating intersection-over-union (IoU) scores during training. More importantly, we also avoid all hyper-parameters related to anchor boxes, to which the final detection performance is often sensitive. With non-maximum suppression (NMS) as the only post-processing step, we demonstrate a much simpler and more flexible detection framework that achieves improved detection accuracy. We hope that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks. Code is available at: git.io/AdelaiDet.
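The per-pixel, anchor-free formulation can be made concrete with a small sketch. The snippet below is an illustrative assumption rather than the authors' released AdelaiDet code: it shows how a feature-map location that falls inside a ground-truth box could be treated as a positive sample and regress the distances (l, t, r, b) to the box's four sides, with no anchor boxes or anchor-IoU computation involved. All function and variable names are hypothetical.

```python
# A minimal, illustrative sketch (not the authors' released code) of the
# anchor-free, per-pixel target assignment described above: a feature-map
# location is a positive sample if it lies inside a ground-truth box, and it
# regresses the distances (l, t, r, b) to the box's four sides.
import numpy as np

def per_pixel_targets(locations, gt_box):
    """locations: (N, 2) array of (x, y) points mapped back to the input image.
    gt_box: (x0, y0, x1, y1) coordinates of one ground-truth box.
    Returns (N, 4) regression targets and an (N,) boolean foreground mask."""
    x, y = locations[:, 0], locations[:, 1]
    x0, y0, x1, y1 = gt_box
    reg = np.stack([x - x0, y - y0, x1 - x, y1 - y], axis=1)  # (l, t, r, b)
    foreground = reg.min(axis=1) > 0  # positive only if the location lies inside the box
    return reg, foreground

# Example: a 4x4 grid of stride-8 locations against a single 24x24 box.
stride = 8
ys, xs = np.mgrid[0:4, 0:4]
locs = np.stack([xs.ravel(), ys.ravel()], axis=1) * stride + stride // 2
reg, fg = per_pixel_targets(locs.astype(float), (4.0, 4.0, 28.0, 28.0))
print(fg.sum(), "foreground locations")  # -> 4
```

At inference, each location's predicted class score and (l, t, r, b) offsets are decoded back into a box, and overlapping detections are then pruned by NMS as the only post-processing step, as stated in the abstract.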