
Comparing human evaluations of eyewitness statements to a machine learning classifier under pristine and suboptimal lineup administration procedures.

Author Affiliations

Department of Psychology, New Mexico State University, United States.

Department of Psychological & Brain Sciences, Washington University in Saint Louis, United States.

Publication Information

Cognition. 2024 Oct;251:105876. doi: 10.1016/j.cognition.2024.105876. Epub 2024 Jul 14.

Abstract

Recent work highlights the ability of verbal machine learning classifiers to distinguish between accurate and inaccurate recognition memory decisions (Dobbins, 2022; Dobbins & Kantner, 2019; Seale-Carlisle, Grabman, & Dodson, 2022). Given the surge of interest in these modeling techniques, there is an urgent need to investigate verbal classifiers' limitations, particularly in applied contexts such as when police collect eyewitnesses' confidence statements. We find that confirmatory feedback (e.g., "This study now has a total of 87 participants, 84 of them made the same decision as you!") weakens the relationship between identification accuracy and verbal classifier scores to a similar degree as it weakens the relationship between accuracy and mock witnesses' numeric confidence judgments (Experiment 1). Crucially, for the first time, we compare the discriminative value of verbal classifier scores to the ratings of human evaluators who assessed the identical verbal confidence statements (Experiment 2). Our results suggest that human evaluators outperform the classifier when mock witnesses received no feedback; however, the classifier matches (or exceeds) the performance of human evaluators when mock witnesses received confirmatory feedback. Providing lineup information to human evaluators impaired their ability to distinguish between correct and filler identifications, suggesting that this particular information may encourage the use of inappropriate heuristics when rendering accuracy judgments. Overall, these results suggest that the utility of verbal classifiers may be enhanced when contextual effects (e.g., lineup presence) impair human estimates of others' performance, but that translating witnesses' statements into classifier scores will not fix the problems of an improperly conducted lineup procedure.

