Department of Linguistics, University of Washington, Seattle, WA 98195, USA.
Clinical Research Division, Fred Hutchinson Cancer Center, Seattle, WA, USA.
Infant Behav Dev. 2024 Jun;75:101943. doi: 10.1016/j.infbeh.2024.101943. Epub 2024 Mar 27.
In North America, the characteristics of a child's language environment predict language outcomes. For example, differences in bilingual language exposure, exposure to electronic media, and exposure to child-directed speech (CDS) relate to children's language growth. Recently, these predictors have been studied using daylong recordings, followed by manual annotation of audio samples selected from these recordings. Using a dataset of daylong recordings collected from bilingually raised infants in the United States as an example, we ask whether two of the most commonly used sampling methods, random sampling and sampling based on high adult speech, differ from each other in their estimates of the frequencies of specific language behaviors. Daylong recordings from 37 Spanish-English speaking families with infants between 4 and 22 months of age were analyzed. From each child's recording, samples were extracted in two ways (at random and based on high adult speech) and then annotated for Language (Spanish/English/Mixed), CDS, Electronic Media, Social Context, Turn-Taking, and Infant Babbling. Correlation and agreement analyses were performed, in addition to paired-sample t-tests, to assess how the choice of sampling method may affect the estimates. For most behaviors studied, correlation and agreement between the two sampling methods were high (Pearson r values between 0.79 and 0.99 for 16 of 17 measures; Intraclass Correlation Coefficient values between 0.78 and 0.99 for 13 of 17 measures). However, interesting between-sample differences also emerged: the degree of language mixing, the amount of CDS, and the number of conversational turns were all significantly higher when sampling was based on high adult speech than when sampling was random. By contrast, the presence of electronic media and one-on-one social contexts was higher when sampling was performed at random.
We discuss advantages of choosing one sampling technique over the other, depending on the research question and variables at hand.
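The comparison described above (correlation, agreement, and a paired test between two sampling methods) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the per-child values are invented, the function names are hypothetical, and the agreement statistic is ICC(2,1) (two-way random effects, absolute agreement, single measure), which is one common form and not necessarily the one used in the study. The sketch shows how two sets of estimates can correlate strongly while still differing systematically in their means.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """Paired-sample t statistic for x - y (degrees of freedom: n - 1)."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def icc_2_1(x, y):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.
    Assumed form for illustration; each child is a row, each method a column."""
    n, k = len(x), 2
    grand = mean(x + y)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]
    col_means = [mean(x), mean(y)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-child mean square
    msc = ss_cols / (k - 1)              # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Invented proportions of annotated segments containing CDS, one value per child.
random_cds = [0.30, 0.42, 0.25, 0.51, 0.38]      # random sampling (hypothetical)
high_adult_cds = [0.41, 0.50, 0.33, 0.62, 0.45]  # high-adult-speech sampling (hypothetical)

r = pearson_r(random_cds, high_adult_cds)
icc = icc_2_1(random_cds, high_adult_cds)
t = paired_t(high_adult_cds, random_cds)
print(f"Pearson r = {r:.2f}, ICC(2,1) = {icc:.2f}, paired t = {t:.2f}")
```

Because absolute-agreement ICC penalizes a consistent offset between methods while Pearson r does not, the ICC in this toy example comes out lower than r. That pattern is one way a high correlation can coexist with a significant paired difference.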