Predicting the replicability of social and behavioural science claims in COVID-19 preprints.

Author information

Marcoci Alexandru, Wilkinson David P, Vercammen Ans, Wintle Bonnie C, Abatayo Anna Lou, Baskin Ernest, Berkman Henk, Buchanan Erin M, Capitán Sara, Capitán Tabaré, Chan Ginny, Cheng Kent Jason G, Coupé Tom, Dryhurst Sarah, Duan Jianhua, Edlund John E, Errington Timothy M, Fedor Anna, Fidler Fiona, Field James G, Fox Nicholas, Fraser Hannah, Freeman Alexandra L J, Hanea Anca, Holzmeister Felix, Hong Sanghyun, Huggins Raquel, Huntington-Klein Nick, Johannesson Magnus, Jones Angela M, Kapoor Hansika, Kerr John, Kline Struhl Melissa, Kołczyńska Marta, Liu Yang, Loomas Zachary, Luis Brianna, Méndez Esteban, Miske Olivia, Mody Fallon, Nast Carolin, Nosek Brian A, Simon Parsons E, Pfeiffer Thomas, Reed W Robert, Roozenbeek Jon, Schlyfestone Alexa R, Schneider Claudia R, Soh Andrew, Song Zhongchen, Tagat Anirudh, Tutor Melba, Tyner Andrew H, Urbanska Karolina, van der Linden Sander

Affiliations

Centre for the Study of Existential Risk, University of Cambridge, Cambridge, UK.

School of Politics and International Relations, University of Nottingham, Nottingham, UK.

Publication information

Nat Hum Behav. 2025 Feb;9(2):287-304. doi: 10.1038/s41562-024-01961-1. Epub 2024 Dec 20.

Abstract

Replications are important for assessing the reliability of published findings. However, they are costly, and it is infeasible to replicate everything. Accurate, fast, lower-cost alternatives such as eliciting predictions could accelerate assessment for rapid policy implementation in a crisis and help guide a more efficient allocation of scarce replication resources. We elicited judgements from participants on 100 claims from preprints about an emerging area of research (COVID-19 pandemic) using an interactive structured elicitation protocol, and we conducted 29 new high-powered replications. After interacting with their peers, participant groups with lower task expertise ('beginners') updated their estimates and confidence in their judgements significantly more than groups with greater task expertise ('experienced'). For experienced individuals, the average accuracy was 0.57 (95% CI: [0.53, 0.61]) after interaction, and they correctly classified 61% of claims; beginners' average accuracy was 0.58 (95% CI: [0.54, 0.62]), correctly classifying 69% of claims. The difference in accuracy between groups was not statistically significant and their judgements on the full set of claims were correlated (r(98) = 0.48, P < 0.001). These results suggest that both beginners and more-experienced participants using a structured process have some ability to make better-than-chance predictions about the reliability of 'fast science' under conditions of high uncertainty. However, given the importance of such assessments for making evidence-based critical decisions in a crisis, more research is required to understand who the right experts in forecasting replicability are and how their judgements ought to be elicited.
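The abstract reports two kinds of summary statistics: the share of claims each group classified correctly, and a correlation between the two groups' judgements over all 100 claims (hence r(98), since the degrees of freedom are n - 2). A minimal sketch of how such figures could be computed follows. It assumes each judgement is a probability that a claim will replicate and that a forecast above 0.5 counts as predicting "replicates"; that threshold, the function names, and the toy data are illustrative assumptions, not the study's documented method or data.

```python
from math import sqrt


def classification_accuracy(probs, outcomes, threshold=0.5):
    """Share of claims where the thresholded forecast matches the outcome.

    Assumption: a probability above `threshold` is read as "will replicate".
    """
    hits = sum((p > threshold) == bool(o) for p, o in zip(probs, outcomes))
    return hits / len(probs)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of judgements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical toy data: four claims, group-level probability judgements,
# and replication outcomes (1 = replicated, 0 = did not replicate).
beginners = [0.8, 0.3, 0.6, 0.4]
experienced = [0.7, 0.4, 0.4, 0.2]
replicated = [1, 0, 1, 1]

beginner_accuracy = classification_accuracy(beginners, replicated)
experienced_accuracy = classification_accuracy(experienced, replicated)
between_group_r = pearson_r(beginners, experienced)
degrees_of_freedom = len(beginners) - 2  # 100 claims would give 98
```

With real data, the classification share could only be computed on claims that have a replication outcome, while the between-group correlation can use every claim that both groups judged.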

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3b/11860236/558d6e766e8a/41562_2024_1961_Fig1_HTML.jpg
