IEEE Trans Pattern Anal Mach Intell. 2022 Mar;44(3):1604-1622. doi: 10.1109/TPAMI.2020.3021025. Epub 2022 Feb 3.
Generic object counting in natural scenes is a challenging computer vision problem. Existing approaches rely on either instance-level supervision or absolute count information to train a generic object counter. We introduce a partially supervised setting that significantly reduces the supervision level required for generic object counting. We propose two novel frameworks, named lower-count (LC) and reduced lower-count (RLC), to enable object counting under this setting. Our frameworks are built on a novel dual-branch architecture that consists of an image classification branch and a density branch. Our LC framework reduces the annotation cost due to multiple instances in an image by using only lower-count supervision for all object categories. Our RLC framework further reduces the annotation cost arising from large numbers of object categories in a dataset by using lower-count supervision for only a subset of categories and class labels for the remaining ones. The RLC framework extends our dual-branch LC framework with a novel weight modulation layer and a category-independent density map prediction. Experiments are performed on the COCO, Visual Genome and PASCAL 2007 datasets. Our frameworks perform on par with state-of-the-art approaches that use higher levels of supervision. Additionally, we demonstrate the applicability of our LC-supervised density map to image-level supervised instance segmentation.
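The difference between the LC and RLC supervision levels described above can be illustrated with a minimal sketch of how the two kinds of labels might be derived from full per-category counts. The abstract does not state the count threshold, so `LOWER_COUNT_LIMIT = 4` is an assumption made for illustration, and the function names, label strings, and category names are hypothetical rather than taken from the paper.

```python
from typing import Dict, Set

# Assumption: the abstract does not give the threshold; 4 is illustrative only.
LOWER_COUNT_LIMIT = 4


def lc_label(count: int, limit: int = LOWER_COUNT_LIMIT) -> str:
    """Lower-count (LC) label: the exact count up to `limit`,
    otherwise only the fact that the count exceeds `limit`."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return str(count) if count <= limit else f">{limit}"


def rlc_labels(
    counts: Dict[str, int],
    counted_categories: Set[str],
    limit: int = LOWER_COUNT_LIMIT,
) -> Dict[str, str]:
    """Reduced lower-count (RLC) labels: LC labels for a chosen subset of
    categories, and only a presence/absence class label for the rest."""
    labels = {}
    for category, count in counts.items():
        if category in counted_categories:
            labels[category] = lc_label(count, limit)
        else:
            labels[category] = "present" if count > 0 else "absent"
    return labels


if __name__ == "__main__":
    # Hypothetical per-image counts for three categories.
    image_counts = {"person": 6, "dog": 2, "car": 0}
    print({c: lc_label(n) for c, n in image_counts.items()})
    print(rlc_labels(image_counts, counted_categories={"person"}))
```

Under these assumptions, an annotator in the LC setting never has to count past the limit, and in the RLC setting only the selected subset of categories needs any counting at all.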