A systematic comparison between visual cues for boundary detection.

Authors

Mély David A, Kim Junkyung, McGill Mason, Guo Yuliang, Serre Thomas

Affiliations

Department of Cognitive, Linguistic and Psychological Sciences, Brown University, Providence, RI 02912, United States.

Department of Engineering, Brown University, Providence, RI 02912, United States.

Publication information

Vision Res. 2016 Mar;120:93-107. doi: 10.1016/j.visres.2015.11.007. Epub 2016 Mar 2.

Abstract

The detection of object boundaries is a critical first step for many visual processing tasks. Multiple cues (we consider luminance, color, motion and binocular disparity) available in the early visual system may signal object boundaries but little is known about their relative diagnosticity and how to optimally combine them for boundary detection. This study thus aims at understanding how early visual processes inform boundary detection in natural scenes. We collected color binocular video sequences of natural scenes to construct a video database. Each scene was annotated with two full sets of ground-truth contours (one set limited to object boundaries and another set which included all edges). We implemented an integrated computational model of early vision that spans all considered cues, and then assessed their diagnosticity by training machine learning classifiers on individual channels. Color and luminance were found to be most diagnostic while stereo and motion were least. Combining all cues yielded a significant improvement in accuracy beyond that of any cue in isolation. Furthermore, the accuracy of individual cues was found to be a poor predictor of their unique contribution for the combination. This result suggested a complex interaction between cues, which we further quantified using regularization techniques. Our systematic assessment of the accuracy of early vision models for boundary detection together with the resulting annotated video dataset should provide a useful benchmark towards the development of higher-level models of visual processing.
