Catherine Han Y, Dworak Elizabeth M, Mansolf Maxwell, Gershon Richard C, Kaat Aaron J
Department of Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, United States.
Infant Behav Dev. 2025 Sep;80:102122. doi: 10.1016/j.infbeh.2025.102122. Epub 2025 Aug 8.
The NIH Baby Toolbox® offers assessments spanning the Cognition, Motor, and Social-Emotional Functioning domains and includes both measure-level and composite scores. Here, we describe the creation of eight composite scores reflecting Language, Executive Function/Memory, Math, Cognition, Motor, Self-Regulation, Negative Affect, and Social Communication, all key constructs in infant and toddler development. Because composite scores combine several measures, they can offer a more holistic evaluation of functioning than measure-specific scores and reduce the impact of outliers and measurement error. Data from the original Baby Toolbox norming study (N = 2515 recruited; n = 2479 with at least one composite score; n = 2025 English, n = 454 Spanish) were factor analyzed to derive the composite scores. Analyses were conducted on regression-weighted factor scores for individual measures to define composites. Psychometric properties were assessed using composite reliability, test-retest reliability, and external validation against the Ages and Stages Questionnaire (3rd edition), the Bayley Scales of Infant and Toddler Development (4th edition), and the child's age. Composite scores demonstrated excellent composite reliability, moderate to strong test-retest reliability, minimal practice effects for most scores, moderate and significant relations with most external measures, and moderate to strong correlations with age for abilities expected to improve with age. The Baby Toolbox composite scores therefore offer a reliable, valid tool for assessing key areas of infant and toddler development. This evidence supports their use as indicators of early cognitive, motor, and social-emotional growth in clinical, research, and educational settings. This framework helps deepen our understanding and practical evaluation of developmental milestones during the early years.