Department of Pediatrics, Stanford University, USA; Department of Biomedical Data Science, Stanford University, USA.
Department of Mathematics, Stanford University, USA.
Artif Intell Med. 2019 Jul;98:77-86. doi: 10.1016/j.artmed.2019.06.004. Epub 2019 Jul 6.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by repetitive behaviors, narrow interests, and deficits in social interaction and communication ability. Increasing emphasis is being placed on developing innovative digital and mobile systems for their potential in therapeutic applications outside of clinical environments. Owing to recent advances in computer vision, various emotion classifiers have been developed that have the potential to play a significant role in mobile screening and therapy for developmental delays that impair emotion recognition and expression. However, these classifiers are trained on datasets of predominantly neurotypical adults and can sometimes fail to generalize to children with autism. Improving existing classifiers and developing new systems that overcome these limitations requires novel methods to crowdsource labeled emotion data from children. In this paper, we present a mobile charades-style game, Guess What?, from which we derive egocentric video with a high density of varied emotion from a 90-second game session. We then present a framework for semi-automatic labeled frame extraction from these videos using meta information from the game session coupled with classification confidence scores. Results show that 94%, 81%, 92%, and 56% of frames were automatically labeled correctly for the categories disgust, neutral, surprise, and scared, respectively, though performance for angry and happy did not improve significantly from the baseline.
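The abstract does not spell out how game meta information and classifier confidence scores are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the game session records which emotion was prompted during which time interval, and that an off-the-shelf classifier returns a per-emotion confidence score for each frame. A frame is kept as labeled only when the classifier's confidence for the prompted emotion reaches a threshold. The names `PromptInterval`, `Frame`, `prompted_emotion`, and `auto_label`, and the default threshold of 0.5, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class PromptInterval:
    """Hypothetical game meta information: the emotion prompted and when."""
    emotion: str
    start_s: float
    end_s: float


@dataclass
class Frame:
    """One video frame with hypothetical per-emotion classifier confidences."""
    time_s: float
    scores: Dict[str, float]


def prompted_emotion(time_s: float,
                     intervals: List[PromptInterval]) -> Optional[str]:
    """Return the emotion prompted at time_s, or None if no prompt was shown."""
    for interval in intervals:
        if interval.start_s <= time_s < interval.end_s:
            return interval.emotion
    return None


def auto_label(frames: List[Frame],
               intervals: List[PromptInterval],
               threshold: float = 0.5) -> List[Tuple[float, str]]:
    """Keep frames whose classifier confidence for the prompted emotion
    is at least `threshold` (an assumed value, not taken from the paper)."""
    labeled = []
    for frame in frames:
        emotion = prompted_emotion(frame.time_s, intervals)
        if emotion is None:
            continue  # frame falls outside every prompt interval
        if frame.scores.get(emotion, 0.0) >= threshold:
            labeled.append((frame.time_s, emotion))
    return labeled
```

Under these assumptions, frames that fall outside any prompt, or for which the classifier disagrees with the prompt, are simply left unlabeled and could be passed to a human reviewer, which is one way to read "semi-automatic".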