Wintle Bonnie C, Smith Eden T, Bush Martin, Mody Fallon, Wilkinson David P, Hanea Anca M, Marcoci Alexandru, Fraser Hannah, Hemming Victoria, Thorn Felix Singleton, McBride Marissa F, Gould Elliot, Head Andrew, Hamilton Daniel G, Kambouris Steven, Rumpff Libby, Hoekstra Rink, Burgman Mark A, Fidler Fiona
MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia.
MetaMelb Research Initiative, School of Historical and Philosophical Studies, University of Melbourne, Parkville 3010, Australia.
R Soc Open Sci. 2023 Jun 7;10(6):221553. doi: 10.1098/rsos.221553. eCollection 2023 Jun.
This paper explores judgements about the replicability of social and behavioural sciences research and what drives those judgements. Using a mixed methods approach, it draws on qualitative and quantitative data elicited from groups using a structured approach called the IDEA protocol ('investigate', 'discuss', 'estimate' and 'aggregate'). Five groups of five people with relevant domain expertise evaluated 25 research claims that were subject to at least one replication study. Participants assessed the probability that each of the 25 research claims would replicate (i.e. that a replication study would find a statistically significant result in the same direction as the original study) and described the reasoning behind those judgements. We quantitatively analysed possible correlates of predictive accuracy, including self-rated expertise and updating of judgements after feedback and discussion. We qualitatively analysed the reasoning data to explore the cues, heuristics and patterns of reasoning used by participants. Participants achieved 84% classification accuracy in predicting replicability. Those who engaged in a greater breadth of reasoning provided more accurate replicability judgements. Some reasons were more commonly invoked by more accurate participants, such as 'effect size' and 'reputation' (e.g. of the field of research). There was also some evidence of a relationship between statistical literacy and accuracy.