Department of Linguistics, University of Washington, Seattle, WA, USA.
Fred Hutchinson Cancer Center, Clinical Research Division, Seattle, WA, USA.
Behav Res Methods. 2024 Mar;56(3):1936-1952. doi: 10.3758/s13428-023-02127-z. Epub 2023 May 5.
The Language ENvironment Analysis system (LENA) records children's language environment and provides an automatic estimate of adult-child conversational turn count (CTC) by automatically identifying adult and child speech in close temporal proximity. To assess the reliability of this measure, we examine correlation and agreement between LENA's CTC estimates and manual measurement of adult-child turn-taking in two corpora collected in the USA: a bilingual corpus of Spanish-English-speaking families with infants between 4 and 22 months (n = 37), and a corpus of monolingual families with English-speaking 5-year-olds (n = 56). For each child in each corpus, 100 30-second segments were extracted from daylong recordings in each of two ways, yielding a total of 9300 minutes of manually annotated audio. LENA's CTC estimate for the same segments was obtained through the LENA software. The two measures of CTC had low correlations for the segments from the monolingual 5-year-olds sampled in both ways, and somewhat higher correlations for the bilingual samples. LENA substantially overestimated CTC on average, relative to manual measurement, for three out of four analysis conditions, and limits of agreement were wide in all cases. Segment-level analyses demonstrated that accidental contiguity had the largest individual impact on LENA's average CTC error, affecting 12-17% of analyzed segments. Other factors significantly contributing to CTC error were speech from other children, presence of multiple adults, and presence of electronic media. These results indicate wide discrepancies between LENA's CTC estimates and manual CTCs, and call into question the comparability of LENA's CTC measure across participants, conditions, and developmental time points.
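The distinction the abstract draws between correlation and agreement can be illustrated with a short sketch. It assumes a Pearson correlation and Bland-Altman-style 95% limits of agreement (mean difference plus or minus 1.96 standard deviations); the abstract names neither method, so both are assumptions rather than the authors' procedure. The function names and the segment counts below are hypothetical and invented for illustration; they are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def limits_of_agreement(auto, manual, z=1.96):
    """Return (bias, lower, upper) for the differences auto - manual.

    A positive bias means the automatic count is higher on average.
    """
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = mean(diffs)
    spread = z * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical per-segment turn counts (made up, NOT from the study).
lena_ctc = [3, 0, 5, 2, 7, 1, 4, 6]
manual_ctc = [2, 0, 3, 2, 4, 0, 4, 3]

r = pearson_r(lena_ctc, manual_ctc)
bias, lower, upper = limits_of_agreement(lena_ctc, manual_ctc)
print(f"r = {r:.2f}, bias = {bias:.2f}, LoA = [{lower:.2f}, {upper:.2f}]")

# Sanity check on the annotated total, assuming 100 segments per child
# for each of the two sampling methods (the reading that yields 9300).
children = 37 + 56
total_minutes = children * 100 * 2 * 30 / 60
print(total_minutes)
```

In this toy data the two measures correlate while the automatic count is still higher on average, which is why a correlation alone does not establish agreement: the bias and the width of the limits describe how far an individual estimate may sit from the manual count.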