Department of General Psychology, University of Padua, Via Venezia, 8, 35131, Padua, PD, Italy.
DISFOR, University of Genoa, Genova, Italy.
Behav Res Methods. 2021 Oct;53(5):1954-1972. doi: 10.3758/s13428-021-01549-x. Epub 2021 Mar 10.
Poor response to treatment is a defining characteristic of reading disorder. In the present systematic review and meta-analysis, we found that the overall average effect size for treatment efficacy was modest, with a mean standardized difference of 0.38. Small true effects, combined with the difficulty of recruiting large samples, seriously challenge researchers planning to test treatment efficacy in dyslexia and potentially in other learning disorders. Nonetheless, most published studies claim effectiveness, generally based on liberal use of multiple testing. This inflates the risk that most statistically significant results are associated with overestimated effect sizes. To enhance power, we propose the strategic use of repeated measurements with mixed-effects modelling. This novel approach would enable us to estimate both individual parameters and population-level effects more reliably. We suggest assessing a reading outcome not once but three times at pre-treatment and three times at post-treatment. Such a design would require only modest additional effort compared to current practices. Based on this, we performed ad hoc a priori design analyses via simulation studies. Results showed that using the novel design may allow one to reach adequate power even with small samples of 30-40 participants (i.e., 15-20 participants per group) for a typical effect size of d = 0.38. Nonetheless, more conservative assumptions are warranted for various reasons, including a high risk of publication bias in the extant literature. Our considerations can be extended to intervention studies of other types of neurodevelopmental disorders.
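The logic of a simulation-based design analysis for one versus three measurements per occasion can be illustrated with a minimal sketch. This is not the authors' simulation and it does not fit a mixed-effects model: as a simplified stand-in, it averages the repeated measurements and compares gain scores between groups with a pooled two-sample t-test. Only d = 0.38 and the 20 participants per group come from the abstract. The function names, the variance values (`sd_change`, `sd_error`), and the scaling of d in units of the baseline between-participant SD are all hypothetical assumptions, so the resulting power values are illustrative only.

```python
import math
import random
import statistics


def t_critical(df, alpha=0.05):
    """Approximate two-sided critical t value (Cornish-Fisher expansion)."""
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    return (z + (z**3 + z) / (4 * df)
            + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2))


def simulate_gain(rng, effect, k, sd_change, sd_error):
    """Observed gain for one participant: mean of k post-treatment scores
    minus mean of k pre-treatment scores."""
    true_pre = rng.gauss(0.0, 1.0)  # assumed baseline SD of 1
    true_post = true_pre + effect + rng.gauss(0.0, sd_change)
    pre = sum(true_pre + rng.gauss(0.0, sd_error) for _ in range(k)) / k
    post = sum(true_post + rng.gauss(0.0, sd_error) for _ in range(k)) / k
    return post - pre


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    diff = statistics.fmean(a) - statistics.fmean(b)
    return diff / math.sqrt(pooled_var * (1 / na + 1 / nb))


def estimate_power(k, n_per_group=20, d=0.38, sd_change=0.3,
                   sd_error=0.5, n_sims=2000, seed=1):
    """Share of simulated trials in which the group difference is significant.

    sd_change and sd_error are hypothetical values, not taken from the paper.
    """
    rng = random.Random(seed)
    crit = t_critical(2 * n_per_group - 2)
    hits = 0
    for _ in range(n_sims):
        treated = [simulate_gain(rng, d, k, sd_change, sd_error)
                   for _ in range(n_per_group)]
        control = [simulate_gain(rng, 0.0, k, sd_change, sd_error)
                   for _ in range(n_per_group)]
        if abs(t_statistic(treated, control)) > crit:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for k in (1, 3):
        print(f"measurements per occasion = {k}: "
              f"estimated power = {estimate_power(k):.2f}")
```

Under these assumed variances, averaging three measurements shrinks the measurement-error component of each gain score, which is why the estimated power rises with `k`. A mixed-effects model fitted to the individual measurements, as proposed in the abstract, uses the same information without collapsing it into averages.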