Setti Francesco, Cristani Marco
IEEE Trans Pattern Anal Mach Intell. 2019 Mar;41(3):566-580. doi: 10.1109/TPAMI.2018.2806970. Epub 2018 Feb 16.
The detection of groups of individuals is attracting the attention of many researchers in diverse fields, from automated surveillance to human-computer interaction, with a growing number of approaches published every year. Surprisingly, the evaluation metrics for this problem are not consolidated: some measures are inherited from the people detection field, others from clustering, and others are designed specifically for a particular approach. As a result, they generalize poorly and make comparisons between different approaches difficult. Moreover, most existing metrics have limited expressiveness: they treat groups as atomic entities, ignoring that groups may have different cardinalities and that a group detection approach may fail to capture the exact number of individuals that compose a group. This paper fills this gap by presenting the GROup DEtection (GRODE) metrics, which formally define precision and recall on groups, including the group cardinality as a variable. This makes it possible to investigate aspects not considered so far, such as the tendency of a method to over- or under-segment, or to handle specific group cardinalities better than others. The GRODE metrics have first been evaluated on controlled scenarios, where the differences from alternative metrics are evident. The metrics have then been applied to eight group detection approaches on eight public datasets, providing a fresh view of the state of the art and revealing interesting strengths and pitfalls of recent approaches.