Department of Health Law, Policy and Management, Boston University School of Public Health, Boston, Massachusetts.
Floating Hospital for Children, Division of Developmental-Behavioral Pediatrics, Tufts University School of Medicine and Medical Center, Boston, Massachusetts.
JAMA Pediatr. 2020 Apr 1;174(4):366-374. doi: 10.1001/jamapediatrics.2019.6000.
IMPORTANCE: Universal developmental screening is widely recommended, yet studies of the accuracy of commonly used questionnaires reveal mixed results, and previous comparisons of these questionnaires are hampered by important methodological differences across studies.
OBJECTIVE: To compare the accuracy of 3 developmental screening instruments as standardized tests of developmental status.
DESIGN, SETTING, AND PARTICIPANTS: This cross-sectional diagnostic accuracy study recruited consecutive parents in waiting rooms at 10 pediatric primary care offices in eastern Massachusetts between October 1, 2013, and January 31, 2017. Parents were included if they were sufficiently literate in the English or Spanish language to complete a packet of screening questionnaires and if their child was of eligible age. Parents completed all questionnaires in counterbalanced order. Participants who screened positive on any questionnaire plus 10% of those who screened negative on all questionnaires (chosen at random) were invited to complete developmental testing. Analyses were weighted for sampling and nonresponse and were conducted from October 1, 2013, to January 31, 2017.
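The sampling design above (all screen-positive children plus a random 10% of children negative on every questionnaire were invited for testing) means raw counts from the tested group would overstate sensitivity. The following is a minimal sketch of how inverse-probability weighting can correct for this. It is an illustration under stated assumptions, not the authors' analysis code: the `Child` class, the `weighted_accuracy` function, and the sample data are hypothetical, and nonresponse weighting is left out.

```python
from dataclasses import dataclass


@dataclass
class Child:
    screen_positive: bool  # result on the questionnaire being evaluated
    delayed: bool          # result of the reference developmental test
    weight: float          # 1 / probability of being invited for testing


def weighted_accuracy(children):
    """Return (sensitivity, specificity) using sampling weights."""
    tp = sum(c.weight for c in children if c.screen_positive and c.delayed)
    fn = sum(c.weight for c in children if not c.screen_positive and c.delayed)
    tn = sum(c.weight for c in children if not c.screen_positive and not c.delayed)
    fp = sum(c.weight for c in children if c.screen_positive and not c.delayed)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one delayed and one non-delayed child")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data, not taken from the study.
# Weight 1: positive on at least one questionnaire (all invited).
# Weight 10: negative on all questionnaires (10% invited at random).
sample = (
    [Child(True, True, 1.0)] * 3
    + [Child(False, True, 10.0)] * 1
    + [Child(True, False, 1.0)] * 2
    + [Child(False, False, 10.0)] * 4
)

sensitivity, specificity = weighted_accuracy(sample)
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```

In this made-up sample, the unweighted sensitivity would be 3 of 4 (75%), whereas the weighted estimate is 3 of 13 (about 23%), because each tested all-negative child stands in for 10 children in the screened population.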
EXPOSURES: The 3 screening instruments used were the Ages & Stages Questionnaire, Third Edition (ASQ-3); Parents' Evaluation of Developmental Status (PEDS); and Survey of Well-being of Young Children (SWYC): Milestones.
MAIN OUTCOMES AND MEASURES: Reference tests administered were Bayley Scales of Infant and Toddler Development, Third Edition, for children aged 0 to 42 months, and Differential Ability Scales, Second Edition, for older children. Age-standardized scores were used as indicators of mild (80-89), moderate (70-79), or severe (<70) delays.
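The age rule and score bands above can be written out as a short sketch. The function names are hypothetical, and treating scores of 90 or above as "none" is an assumption that the abstract implies but does not state.

```python
def reference_test(age_months: float) -> str:
    """Pick the reference test by age, per the 42-month cutoff in the abstract."""
    if age_months <= 42:
        return "Bayley-III"
    return "DAS-II"


def classify_delay(standard_score: float) -> str:
    """Map an age-standardized score to a delay category."""
    if standard_score < 70:
        return "severe"
    if standard_score < 80:
        return "moderate"
    if standard_score < 90:
        return "mild"
    return "none"  # assumption: 90 or above is treated as no delay


print(reference_test(30), classify_delay(84))
```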
RESULTS: A total of 1495 families of children aged 9 months to 5.5 years participated. The mean (SD) age of the children at enrollment was 2.6 (1.3) years, and 779 (52.1%) were male. Parent respondents were primarily female (1325 [88.7%]), with a mean (SD) age of 33.4 (6.3) years. Of the 20.5% to 29.0% of children with a positive score on each questionnaire, 35% to 60% also received a positive score on a second questionnaire, demonstrating moderate co-occurrence. Among younger children (<42 months), the specificity of the ASQ-3 (89.4%; 95% CI, 85.9%-92.1%) and SWYC Milestones (89.0%; 95% CI, 86.1%-91.4%) was higher than that of the PEDS (79.6%; 95% CI, 75.7%-83.1%; P < .001 and P = .002, respectively), but differences in sensitivity were not statistically significant. Among older children (43-66 months), specificity of the ASQ-3 (92.1%; 95% CI, 85.1%-95.9%) was higher than that of the SWYC Milestones (70.7%; 95% CI, 60.9%-78.8%) and the PEDS (73.7%; 95% CI, 64.3%-81.3%; P < .001), but sensitivity to mild delays of the SWYC Milestones (54.8%; 95% CI, 38.1%-70.4%) and of the PEDS (61.8%; 95% CI, 43.1%-77.5%) was higher than that of the ASQ-3 (23.5%; 95% CI, 9.0%-48.8%; P = .012 and P = .002, respectively). Sensitivity exceeded 70% only with respect to severe delays, with 73.7% (95% CI, 50.1%-88.6%) for the SWYC Milestones among younger children, 78.9% (95% CI, 55.4%-91.9%) for the PEDS among younger children, and 77.8% (95% CI, 41.8%-94.5%) for the PEDS among older children. Attending to parents' concerns was associated with increased sensitivity of all questionnaires.
CONCLUSIONS AND RELEVANCE: This study found that 3 frequently used screening questionnaires offer adequate specificity but modest sensitivity for detecting developmental delays among children aged 9 months to 5 years. The results suggest that trade-offs in sensitivity and specificity occurred among the questionnaires, with no one questionnaire emerging superior overall.