George Washington University, Washington DC, USA.
Northwestern University, Evanston, USA.
Prev Sci. 2022 Nov;23(8):1343-1358. doi: 10.1007/s11121-022-01407-y. Epub 2022 Aug 30.
Clearinghouses develop scientific criteria that they then use to vet existing research studies on a program and reach a verdict about how evidence-based it is. This verdict is then recorded on a website in the hope that stakeholders in science, public policy, the media, and even the general public will consult it. This paper (1) compares the causal design and analysis preferences of 13 clearinghouses that assess the effectiveness of social and behavioral development programs, (2) estimates how consistently these clearinghouses rank the same program, and then (3) uses case studies to probe why their conclusions differ. Most clearinghouses place their highest value on randomized controlled trials, but they differ in how they treat program implementation and quasi-experiments, and in whether their highest program ratings require effects of a given size that replicate independently or persist over time. Of the 2525 social and behavioral development programs sampled across clearinghouses, 82% (n = 2069) were rated by a single clearinghouse. Of the 297 programs rated by two clearinghouses, the two agreed about program effectiveness for about 55% (n = 164), but the clearinghouses agreed much more on program ineffectiveness than on effectiveness. Most of the inconsistency is due to differences among clearinghouses in requiring independently replicated and/or temporally sustained effects. Without scientific consensus about matters like these, "evidence-based" will remain more of an aspiration than an achievement in the social and behavioral sciences.
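The percentages reported above follow directly from the counts given in the abstract. As a minimal sketch of that arithmetic (the function and variable names are illustrative assumptions, not taken from the paper, and the counts are only those stated in the abstract):

```python
def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts as stated in the abstract.
TOTAL_PROGRAMS = 2525   # programs sampled across the 13 clearinghouses
RATED_BY_ONE = 2069     # programs rated by a single clearinghouse
RATED_BY_TWO = 297      # programs rated by exactly two clearinghouses
AGREED_PAIRS = 164      # two-clearinghouse programs with matching verdicts

# Share of programs that only one clearinghouse rated.
single_share = share(RATED_BY_ONE, TOTAL_PROGRAMS)

# Raw percent agreement among programs rated by two clearinghouses.
agreement_rate = share(AGREED_PAIRS, RATED_BY_TWO)

print(f"Rated by a single clearinghouse: {single_share:.1f}%")
print(f"Agreement when two clearinghouses rated: {agreement_rate:.1f}%")
```

This is raw percent agreement only. It does not correct for chance agreement, and the abstract does not say whether the paper applies such a correction.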