Benchmarking the speed-accuracy tradeoff in object recognition by humans and neural networks.

Authors

Subramanian Ajay, Price Sara, Kumbhar Omkar, Sizikova Elena, Majaj Najib J, Pelli Denis G

Affiliations

Department of Psychology, New York University, New York, NY, USA.

Center for Data Science, New York University, New York, NY, USA.

Publication

J Vis. 2025 Jan 2;25(1):4. doi: 10.1167/jov.25.1.4.

Abstract

Active object recognition, fundamental to tasks like reading and driving, relies on the ability to make time-sensitive decisions. People exhibit a flexible tradeoff between speed and accuracy, a crucial human skill. However, current computational models struggle to incorporate time. To address this gap, we present the first dataset (with 148 observers) exploring the speed-accuracy tradeoff (SAT) in ImageNet object recognition. Participants performed a 16-way ImageNet categorization task where their responses counted only if they occurred near the time of a fixed-delay beep. Each block of trials allowed one reaction time. As expected, human accuracy increases with reaction time. We compare human performance with that of dynamic neural networks that adapt their computation to the available inference time. Time is a scarce resource for human object recognition, and finding an appropriate analog in neural networks is challenging. Networks can repeat operations by using layers, recurrent cycles, or early exits. We use the repetition count as a network's analog for time. In our analysis, the number of layers, recurrent cycles, and early exits correlates strongly with floating-point operations, making them suitable time analogs. Comparing networks and humans on SAT-fit error, category-wise correlation, and SAT-curve steepness, we find cascaded dynamic neural networks most promising in modeling human speed and accuracy. Surprisingly, convolutional recurrent networks, typically favored in human object recognition modeling, perform the worst on our benchmark.
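The scoring rule described above (a response counts only if it lands near the beep, with one target delay per block) can be sketched roughly as follows. This is an illustrative sketch, not the authors' analysis code: the function name `sat_curve`, the trial tuple layout, the `tolerance_ms` window, and the toy trials are all assumptions, since the abstract does not state the window width or the data format.

```python
from collections import defaultdict


def sat_curve(trials, tolerance_ms=100.0):
    """Return {beep_delay_ms: accuracy}, counting only responses near the beep.

    trials: iterable of (beep_delay_ms, reaction_time_ms, chosen_label, true_label)
    tolerance_ms: hypothetical half-width of the accepted window around the beep
    """
    counted = defaultdict(int)
    correct = defaultdict(int)
    for beep_delay, reaction_time, chosen, truth in trials:
        if abs(reaction_time - beep_delay) > tolerance_ms:
            continue  # too early or too late: the response does not count
        counted[beep_delay] += 1
        correct[beep_delay] += int(chosen == truth)
    # One accuracy value per target delay gives the points of an SAT curve.
    return {delay: correct[delay] / counted[delay] for delay in sorted(counted)}


# Made-up trials for illustration only (not data from the study).
trials = [
    (200, 210, "dog", "cat"),
    (200, 190, "cat", "cat"),
    (200, 450, "cat", "cat"),  # 250 ms after the beep, so it is excluded
    (800, 820, "dog", "dog"),
    (800, 780, "car", "car"),
]
curve = sat_curve(trials)
```

Plotting accuracy against delay from such a dictionary yields a human SAT curve; a network curve could be built the same way by substituting a repetition count (layers, recurrent cycles, or exit index) for the delay.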

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ebd0/11706240/a775c4b6b6db/jovi-25-1-4-f001.jpg
