A network that learns to recognize three-dimensional objects.

Author information

Poggio T, Edelman S

Affiliation

Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge 02139.

Publication information

Nature. 1990 Jan 18;343(6255):263-6. doi: 10.1038/343263a0.

Abstract

The visual recognition of three-dimensional (3-D) objects on the basis of their shape poses at least two difficult problems. First, there is the problem of variable illumination, which can be addressed by working with relatively stable features such as intensity edges rather than the raw intensity images. Second, there is the problem of the initially unknown pose of the object relative to the viewer. In one approach to this problem, a hypothesis is first made about the viewpoint, then the appearance of a model object from such a viewpoint is computed and compared with the actual image. Such recognition schemes generally employ 3-D models of objects, but the automatic learning of 3-D models is itself a difficult problem. To address this problem in computational vision, we have developed a scheme, based on the theory of approximation of multivariate functions, that learns from a small set of perspective views a function mapping any viewpoint to a standard view. A network equivalent to this scheme will thus 'recognize' the object on which it was trained from any viewpoint.
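The idea of learning a view-to-standard-view mapping from a few examples can be illustrated with a small sketch. The abstract does not name the approximation scheme or the image features, so everything here is an assumption: Gaussian radial basis functions centered on the training views, views represented as the image coordinates of a few labeled feature points, and hypothetical names throughout (`perspective_view`, `fit_view_mapper`, `map_to_standard`, the `wire` object, the kernel width `sigma`). It is meant as an illustration of the approach, not as the authors' implementation.

```python
import numpy as np

def rotation(az, el):
    """Rotate by az about the vertical (y) axis, then by el about the x axis."""
    ca, sa, ce, se = np.cos(az), np.sin(az), np.cos(el), np.sin(el)
    ry = np.array([[ca, 0.0, sa], [0.0, 1.0, 0.0], [-sa, 0.0, ca]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, ce, -se], [0.0, se, ce]])
    return rx @ ry

def perspective_view(points, az, el, distance=5.0):
    """Perspective-project rotated 3-D points into a flat vector of 2-D coordinates."""
    p = points @ rotation(az, el).T
    depth = p[:, 2] + distance  # assumed: camera on the z axis, object near the origin
    return (p[:, :2] / depth[:, None]).ravel()

def gaussian_kernel(a, b, sigma):
    """Gaussian of the pairwise distances between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_view_mapper(train_views, standard_view, sigma):
    """Solve for weights so that every training view maps to the standard view."""
    k = gaussian_kernel(train_views, train_views, sigma)
    targets = np.tile(standard_view, (len(train_views), 1))
    return np.linalg.solve(k, targets)

def map_to_standard(views, train_views, weights, sigma):
    """Apply the learned mapping to one or more views (one view per row)."""
    return gaussian_kernel(views, train_views, sigma) @ weights

# Hypothetical object: six labeled feature points on a bent wire.
wire = np.array([
    [1.0, 0.2, -0.5], [-0.8, 0.9, 0.3], [0.1, -1.0, 0.7],
    [0.6, 0.5, 1.0], [-0.4, -0.3, -0.9], [0.0, 0.8, -0.2],
])

# A small training set: a 3 x 3 grid of viewpoints (angles in radians).
angles = [-0.4, 0.0, 0.4]
train_views = np.array(
    [perspective_view(wire, az, el) for az in angles for el in angles]
)
standard = perspective_view(wire, 0.0, 0.0)
sigma = 0.2  # hand-picked kernel width, an assumption of this sketch
weights = fit_view_mapper(train_views, standard, sigma)

# A novel view of the trained object and a view of a different object
# (here simply the point-reflected wire) from the same viewpoint.
novel = perspective_view(wire, 0.2, -0.2)
distractor = perspective_view(-wire, 0.2, -0.2)

err_same = np.linalg.norm(
    map_to_standard(novel[None, :], train_views, weights, sigma)[0] - standard
)
err_other = np.linalg.norm(
    map_to_standard(distractor[None, :], train_views, weights, sigma)[0] - standard
)
print("trained object:", err_same, "other object:", err_other)
```

In this sketch, "recognition" would amount to thresholding the distance between the mapped view and the standard view: a view of the trained object should land close to the standard view, while a view of another object should not. The threshold, like the kernel width, would have to be chosen for the data at hand.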
