
Human-like monocular depth biases in deep neural networks.

Authors

Kubota Yuki, Fukiage Taiki

Affiliation

Communication Science Laboratories, NTT, Inc., Kanagawa, Japan.

Publication information

PLoS Comput Biol. 2025 Aug 19;21(8):e1013020. doi: 10.1371/journal.pcbi.1013020. eCollection 2025 Aug.

DOI: 10.1371/journal.pcbi.1013020
PMID: 40828862
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12380331/
Abstract

Human depth perception from 2D images is systematically distorted, yet the nature of these distortions is not fully understood. By examining error patterns in depth estimation for both humans and deep neural networks (DNNs), which have shown remarkable abilities in monocular depth estimation, we can gain insights into constructing functional models of this human 3D vision and designing artificial models with improved interpretability. Here, we propose a comprehensive human-DNN comparison framework for a monocular depth judgment task. Using a novel human-annotated dataset of natural indoor scenes and a systematic analysis of absolute depth judgments, we investigate error patterns in both humans and DNNs. Employing exponential-affine fitting, we decompose depth estimation errors into depth compression, per-image affine transformations (including scaling, shearing, and translation), and residual errors. Our analysis reveals that human depth judgments exhibit systematic and consistent biases, including depth compression, a vertical bias (perceiving objects in the lower visual field as closer), and consistent per-image affine distortions across participants. Intriguingly, we find that DNNs with higher accuracy partially recapitulate these human biases, demonstrating greater similarity in affine parameters and residual error patterns. This suggests that these seemingly suboptimal human biases may reflect efficient, ecologically adapted strategies for depth inference from inherently ambiguous monocular images. However, while DNNs capture metric-level residual error patterns similar to humans, they fail to reproduce human-level accuracy in ordinal depth perception within the affine-invariant space. These findings underscore the importance of evaluating error patterns beyond raw accuracy, providing new insights into how humans and computational models resolve depth ambiguity. 
Our dataset and methodology provide a framework for evaluating the alignment between computational models and human perceptual biases, thereby advancing our understanding of visual space representation and guiding the development of models that more faithfully capture human depth perception.
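
The decomposition described in the abstract (a shared depth compression, a per-image affine transformation, and residual errors) can be illustrated with a minimal sketch. The abstract does not give the exact parameterization, so the model form used here, judged ≈ (s·d + hx·x + hy·y + t)^γ, is an assumption made for illustration only: γ stands for compression, s for scaling, hx and hy for shearing along assumed image coordinates, and t for translation. The function names, the grid search over γ, and the synthetic data are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_affine(depth, x, y, target):
    """Least-squares fit of target ~ s*depth + hx*x + hy*y + t (assumed affine form)."""
    A = np.column_stack([depth, x, y, np.ones_like(depth)])
    params, *_ = np.linalg.lstsq(A, target, rcond=None)
    return params, A @ params

def fit_exponential_affine(images, gammas=None):
    """Grid-search one shared exponent gamma; fit affine parameters per image.

    images: list of (true_depth, x, y, judged_depth) arrays, one tuple per image.
    Returns (gamma, per-image affine params, per-image residuals).
    """
    if gammas is None:
        gammas = np.linspace(0.3, 1.5, 25)  # hypothetical search range, step 0.05
    best = None
    for gamma in gammas:
        params_list, resid_list, sse = [], [], 0.0
        for d, x, y, judged in images:
            # Undo the candidate compression so the affine part becomes linear.
            params, affine = fit_affine(d, x, y, judged ** (1.0 / gamma))
            # Re-apply the compression; clip to keep the power well defined.
            pred = np.maximum(affine, 1e-9) ** gamma
            resid = judged - pred  # what the model leaves unexplained
            params_list.append(params)
            resid_list.append(resid)
            sse += float(np.sum(resid ** 2))
        if best is None or sse < best[0]:
            best = (sse, gamma, params_list, resid_list)
    return best[1], best[2], best[3]

def make_synthetic_image(rng, gamma, params, n=200):
    """Synthetic judgments that follow the assumed model exactly (no noise)."""
    d = rng.uniform(1.0, 10.0, n)   # true depths, arbitrary units
    x = rng.uniform(-1.0, 1.0, n)   # normalized horizontal image position
    y = rng.uniform(-1.0, 1.0, n)   # normalized vertical image position
    s, hx, hy, t = params
    return d, x, y, (s * d + hx * x + hy * y + t) ** gamma
```

On real judgments the residuals would not vanish; in this sketch they are simply whatever the assumed compression and affine terms fail to explain, which is the quantity the abstract compares between humans and DNNs.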

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/a0f66907ea18/pcbi.1013020.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/e5db2f7e9282/pcbi.1013020.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/7a0f66e03ab7/pcbi.1013020.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/6de9c38d8fbf/pcbi.1013020.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/3394fa149fc5/pcbi.1013020.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/f95b028ff306/pcbi.1013020.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/e64138876049/pcbi.1013020.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/ee2fee583874/pcbi.1013020.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/7f17d18e8feb/pcbi.1013020.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/e4e0269f37d4/pcbi.1013020.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/0dff674b9e78/pcbi.1013020.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb84/12380331/0e5fa82bb1e2/pcbi.1013020.g012.jpg
