Department of Communicative Sciences and Disorders, Michigan State University, 104 Oyer Center, East Lansing, MI, 48824, USA.
Department of Otolaryngology - Head and Neck Surgery, The Ohio State University, Columbus, OH, USA.
Behav Res Methods. 2021 Feb;53(1):113-138. doi: 10.3758/s13428-020-01419-y.
Automatic speech processing devices have become popular for quantifying amounts of ambient language input to children in their home environments. We assessed error rates for language input estimates for the Language ENvironment Analysis (LENA) audio processing system, asking whether error rates differed as a function of adult talkers' gender and whether they were speaking to children or adults. Audio was sampled from within LENA recordings from 23 families with children aged 4-34 months. Human coders identified vocalizations by adults and children, counted intelligible words, and determined whether adults' speech was addressed to children or adults. LENA's classification accuracy was assessed by parceling audio into 100-ms frames and comparing, for each frame, human and LENA classifications. LENA correctly classified adult speech 67% of the time across families (average false negative rate: 33%). LENA's Adult Word Count showed a mean +47% error relative to human counts. Classification and Adult Word Count error rates were significantly affected by talkers' gender and whether speech was addressed to a child or an adult. The largest systematic errors occurred when adult females addressed children. Results show LENA's classifications and Adult Word Count entailed random, and sometimes large, errors across recordings, as well as systematic errors as a function of talker gender and addressee. Due to systematic and sometimes high error in estimates of amount of adult language input, relying on this metric alone may lead to invalid clinical and/or research conclusions. Further validation studies and circumspect usage of LENA are warranted.
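The frame-based comparison and the word count error described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the (start_ms, end_ms, label) segment format, the "other" default label, and the signed percent-error formula are not taken from the study, and the numbers are toy values rather than study data.

```python
import math

FRAME_MS = 100  # frame length stated in the abstract


def to_frames(segments, total_ms, frame_ms=FRAME_MS):
    """Turn (start_ms, end_ms, label) segments into one label per frame.

    Hypothetical input format. Frames not covered by any segment are
    labeled "other"; later segments overwrite earlier ones on overlap.
    """
    n_frames = total_ms // frame_ms
    frames = ["other"] * n_frames
    for start_ms, end_ms, label in segments:
        first = start_ms // frame_ms
        last = min(n_frames, math.ceil(end_ms / frame_ms))
        for i in range(first, last):
            frames[i] = label
    return frames


def adult_hit_rate(human_frames, lena_frames, target="adult"):
    """Share of human-coded adult frames that the system also labels adult."""
    adult_idx = [i for i, h in enumerate(human_frames) if h == target]
    if not adult_idx:
        raise ValueError("no human-coded adult frames to evaluate")
    hits = sum(1 for i in adult_idx if lena_frames[i] == target)
    return hits / len(adult_idx)


def word_count_error_pct(system_count, human_count):
    """Signed percent error of a system word count relative to a human count.

    Assumed definition: 100 * (system - human) / human.
    """
    return 100.0 * (system_count - human_count) / human_count


# Toy example (invented values): one second of audio, 10 frames.
human = to_frames([(0, 600, "adult"), (600, 1000, "child")], total_ms=1000)
lena = to_frames([(0, 400, "adult"), (400, 1000, "child")], total_ms=1000)

hit_rate = adult_hit_rate(human, lena)        # 4 of 6 adult frames matched
false_negative_rate = 1.0 - hit_rate          # 2 of 6 adult frames missed
awc_error = word_count_error_pct(147, 100)    # system overcounts by 47%
```

Under these assumptions the false negative rate is simply the complement of the hit rate, which matches the way the abstract pairs 67% correct classification with a 33% false negative rate.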