Dorsett Richard, Hendra Richard, Robins Philip K
1 University of Westminster, London, United Kingdom.
2 MDRC, New York City, NY, USA.
Eval Rev. 2018 Oct-Dec;42(5-6):491-514. doi: 10.1177/0193841X16674395. Epub 2016 Oct 25.
Even a well-designed randomized controlled trial (RCT) can produce ambiguous results. This article highlights a case in which full-sample results from a large-scale RCT in the United Kingdom differ from results for a subsample of survey respondents.
Our objective is to ascertain the source of the discrepancy in inferences across data sources and, in doing so, to highlight important threats to the reliability of the causal conclusions derived from even the strongest research designs.
The study analyzes administrative data to shed light on the source of the differences between the estimates. We explore the extent to which heterogeneous treatment impacts and survey nonresponse might explain these differences. We suggest checks that assess the external validity of survey-measured impacts, which in turn provide an opportunity to test how effectively different weighting schemes remove bias. The subjects were 6,787 individuals who participated in a large-scale social policy experiment.
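As an illustration only, the kind of check described here can be sketched on simulated data. The sketch assumes an outcome recorded in administrative data for everyone, so the impact can be estimated on the full sample, on survey respondents alone, and on respondents reweighted by inverse response rates. The field names, response rates, impact sizes, and the cell-based weighting scheme are all hypothetical and are not taken from the study; only the sample size of 6,787 echoes the abstract.

```python
import random


def diff_in_means(rows, weight_key=None):
    """Treatment-minus-control difference in (optionally weighted) mean outcome."""
    means = {}
    for arm in (0, 1):
        arm_rows = [r for r in rows if r["treated"] == arm]
        weights = [r[weight_key] if weight_key else 1.0 for r in arm_rows]
        total = sum(w * r["outcome"] for w, r in zip(weights, arm_rows))
        means[arm] = total / sum(weights)
    return means[1] - means[0]


def add_cell_weights(rows):
    """Attach inverse response-rate weights within arm x baseline-group cells."""
    cells = {}
    for r in rows:
        key = (r["treated"], r["group"])
        n, responded = cells.get(key, (0, 0))
        cells[key] = (n + 1, responded + r["responded"])
    for r in rows:
        n, responded = cells[(r["treated"], r["group"])]
        r["weight"] = n / responded if responded else 0.0


def simulate(n=6787, seed=0):
    """Simulated (hypothetical) experiment with heterogeneous impacts and
    survey response that depends on arm and baseline group."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        treated = rng.randint(0, 1)
        group = rng.randint(0, 1)
        # Assumed heterogeneity: the treatment only helps baseline group 1.
        impact = 2.0 if group == 1 else 0.0
        outcome = 10.0 + impact * treated + rng.gauss(0.0, 1.0)
        # Assumed response pattern: treated members of group 1 respond most.
        if treated == 1:
            p_respond = 0.8 if group == 1 else 0.4
        else:
            p_respond = 0.6
        responded = 1 if rng.random() < p_respond else 0
        rows.append({"treated": treated, "group": group,
                     "outcome": outcome, "responded": responded})
    return rows


rows = simulate()
add_cell_weights(rows)
respondents = [r for r in rows if r["responded"] == 1]

# Benchmark: administrative outcome is observed for the full sample.
full_impact = diff_in_means(rows)
# Same outcome, survey respondents only (exposed to nonresponse bias).
respondent_impact = diff_in_means(respondents)
# Respondents reweighted to resemble the full sample within each cell.
weighted_impact = diff_in_means(respondents, weight_key="weight")

print(f"full sample:          {full_impact:.3f}")
print(f"respondents only:     {respondent_impact:.3f}")
print(f"weighted respondents: {weighted_impact:.3f}")
```

In this sketch the weights recover the full-sample estimate only because response is simulated to depend solely on the observed arm and baseline group. When nonresponse also depends on unobserved characteristics, the same comparison against the administrative benchmark would show how much bias a given weighting scheme leaves behind.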
Our results are not definitive but suggest that nonresponse bias is the main source of the inconsistent findings.
The results caution against overconfidence in drawing conclusions from RCTs and highlight the need for great care in data collection and analysis. In particular, given the modest size of impacts expected in most RCTs, small discrepancies between data sources can alter the results. Survey data remain important as a source of information on outcomes not recorded in administrative data. However, linking survey and administrative data is strongly recommended whenever possible.