Decoupled Two-Stage Crowd Counting and Beyond.

Publication information

IEEE Trans Image Process. 2021;30:2862-2875. doi: 10.1109/TIP.2021.3055631. Epub 2021 Feb 12.

Abstract

One appealing approach to counting dense objects, such as crowds, is density map estimation. Density maps, however, present ambiguous appearance cues in congested scenes, making it infeasible to identify individuals and difficult to diagnose errors. Inspired by the observation that counting can be interpreted as a two-stage process, i.e., identifying possible object regions and counting exact object numbers, we introduce a probabilistic intermediate representation termed the probability map that depicts the probability of each pixel being an object. This representation allows us to decouple counting into probability map regression (PMR) and count map regression (CMR). We therefore propose a novel decoupled two-stage counting (D2C) framework that sequentially regresses the probability map and learns a counter conditioned on the probability map. Given the probability map and the count map, a peak point detection algorithm is derived to localize each object with a point under the guidance of local counts. An advantage of D2C is that the counter can be learned reliably with additional synthesized probability maps. This addresses the important problems of data deficiency and sample imbalance in counting. Our framework also enables easy diagnosis and analysis of error patterns. For instance, we find that the counter per se is sufficiently accurate, while the bottleneck appears to be PMR. We further instantiate a network, D2CNet, in our framework and report state-of-the-art counting and localization performance across 6 crowd counting benchmarks. Since the probability map is a representation independent of visual appearance, D2CNet also exhibits remarkable cross-dataset transferability. Code and pretrained models are made available at: https://git.io/d2cnet.
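The idea of localizing objects "under the guidance of local counts" can be illustrated with a minimal sketch. This is not the peak point detection algorithm derived in the paper, whose details the abstract does not give. It assumes, purely for illustration, that the count map holds one count per non-overlapping square block of the probability map, and it simply keeps the highest-probability pixels in each block, as many as the rounded local count. The function name `localize_by_local_counts` and the `block` parameter are hypothetical.

```python
import numpy as np

def localize_by_local_counts(prob_map, count_map, block=8):
    """Hypothetical sketch: in each block, keep the round(count) most probable pixels.

    prob_map  : 2-D array, per-pixel probability of being an object.
    count_map : 2-D array, one predicted count per block (assumed layout).
    block     : side length of each block in pixels (assumed).
    Returns a list of (row, col) points in prob_map coordinates.
    """
    points = []
    rows, cols = count_map.shape
    for i in range(rows):
        for j in range(cols):
            # Number of objects the counter predicts for this block.
            k = int(round(float(count_map[i, j])))
            if k <= 0:
                continue
            patch = prob_map[i * block:(i + 1) * block,
                             j * block:(j + 1) * block]
            k = min(k, patch.size)
            # Flat indices of the k largest probabilities in the block.
            top = np.argsort(patch, axis=None)[::-1][:k]
            ys, xs = np.unravel_index(top, patch.shape)
            for y, x in zip(ys, xs):
                points.append((i * block + int(y), j * block + int(x)))
    return points
```

Taking the top pixels per block is a deliberate simplification: a real peak detector would also suppress neighboring pixels so that one object does not yield several adjacent points.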
