Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, BN1 9RH, UK.
School of Life Sciences, Joseph Banks Laboratories, University of Lincoln, Beevor Street, Lincoln, LN6 7DL, UK.
Anim Cogn. 2021 Sep;24(5):947-956. doi: 10.1007/s10071-021-01490-8. Epub 2021 Mar 9.
Quantifying the intensity of animals' reactions to stimuli is notoriously difficult, as classic unidimensional measures of responses such as latency or duration of looking can fail to capture the overall strength of behavioural responses. More holistic ratings can be useful but carry the inherent risks of subjective bias and lack of repeatability. Here, we explored whether crowdsourcing could be used to efficiently and reliably overcome these potential flaws. A total of 396 participants watched online videos of dogs reacting to auditory stimuli and provided 23,248 ratings of the strength of the dogs' responses from zero (default) to 100 using an online survey form. We found that raters achieved very high inter-rater reliability across multiple datasets (although their responses were affected by their sex, age, and attitude towards animals) and that as few as 10 raters could be used to achieve a reliable result. A linear mixed model applied to PCA components of behaviours indicated that the dogs' facial expressions and head orientation influenced the strength of behaviour ratings the most. Further linear mixed models showed that strength of behaviour ratings was moderately correlated with the duration of dogs' reactions but not with dogs' reaction latency (from the stimulus onset). This suggests that observers' ratings captured consistent dimensions of animals' responses that are not fully represented by more classic unidimensional metrics. Finally, we report that, overall, participants strongly enjoyed the experience. Thus, we suggest that crowdsourcing can offer a useful, repeatable tool to assess behavioural intensity in experimental or observational studies where unidimensional coding may miss nuance, or where coding multiple dimensions may be too time-consuming.
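As an illustration of how the reliability of a small rater panel might be checked, the following is a minimal sketch, not the authors' analysis. The abstract does not state which reliability statistic was used; the sketch assumes an average-measures consistency intraclass correlation, ICC(3,k), on a complete videos-by-raters matrix. The function names, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np

def icc_consistency_avg(ratings):
    """ICC(3,k): consistency of the mean rating across k raters.

    ratings: 2-D array, rows = videos, columns = raters, no missing values.
    Assumed statistic; the abstract does not name the one actually used.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Two-way ANOVA sums of squares: videos (rows), raters (columns), residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / ms_rows

def simulate_ratings(n_videos=40, n_raters=50, noise_sd=15.0, seed=0):
    """Hypothetical 0-100 ratings: true intensity + rater bias + noise."""
    rng = np.random.default_rng(seed)
    true_intensity = rng.uniform(0, 100, size=(n_videos, 1))
    rater_bias = rng.normal(0, 10, size=(1, n_raters))
    noise = rng.normal(0, noise_sd, size=(n_videos, n_raters))
    return np.clip(true_intensity + rater_bias + noise, 0, 100)

ratings = simulate_ratings()

# Draw a random panel of 10 raters and compare it with the full panel.
rng = np.random.default_rng(1)
panel = rng.choice(ratings.shape[1], size=10, replace=False)
icc_all = icc_consistency_avg(ratings)
icc_10 = icc_consistency_avg(ratings[:, panel])
print(f"ICC(3,k), all raters: {icc_all:.3f}")
print(f"ICC(3,k), 10 raters:  {icc_10:.3f}")
```

Repeating the random draw many times and for different panel sizes would give a curve of reliability against number of raters, which is one way a threshold such as 10 raters could be examined on real data.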