Robótica y Manufactura Avanzada, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Ramos Arizpe, Coahuila, 25900, México.
Sci Data. 2023 Aug 12;10(1):538. doi: 10.1038/s41597-023-02435-1.
We document the relabeling of a subset of a well-known database for emotion-in-context recognition, with the aim of improving the reliability of the final labels. To this end, the emotion categories were organized into eight groups, and a large number of participants were asked to tag the images. A strict control strategy was applied throughout the experiments, which lasted 13.45 minutes per day on average. Annotators were free to take part in any of the daily experiments (the average number of participants was 28), and a Z-score filtering technique was applied to preserve the trustworthiness of the annotations. As a result, the agreement measure Fleiss' kappa varied from slight to almost perfect across the experiments, revealing a coherent diversity among them. Our results support the hypothesis that a small number of categories and a large number of voters benefit the reliability of annotations in contextual emotion imagery.
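The two quantitative tools named in the abstract, Fleiss' kappa and Z-score filtering, can be illustrated with a minimal Python sketch. The kappa function follows the standard textbook definition. The filter is only an assumption: the abstract does not say which per-annotator statistic was standardized or which cutoff was used, so the `scores` input, the `threshold` value, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned image i to category j.
    Every image must have received the same number of ratings."""
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_items = len(counts)
    n_cats = len(counts[0])
    total = n_items * n_raters

    # Share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_cats)]
    # Observed pairwise agreement for each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = mean(p_i)                # mean observed agreement
    p_e = sum(p * p for p in p_j)    # agreement expected by chance
    if p_e == 1.0:
        return float("nan")          # undefined: only one category was ever used
    return (p_bar - p_e) / (1.0 - p_e)


def zscore_keep(scores, threshold=2.0):
    """Hypothetical filter: keep annotators whose score lies within
    `threshold` population standard deviations of the mean score."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [True] * len(scores)
    return [abs((s - mu) / sigma) <= threshold for s in scores]
```

On the commonly used Landis and Koch scale, kappa values up to 0.20 are read as "slight" agreement and values above 0.80 as "almost perfect", which is the range the abstract reports. With eight category groups, each row of `counts` would hold eight entries.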