Evolutionary optimization of a hierarchical object recognition model

Schneider Georg, Wersing Heiko, Sendhoff Bernhard, Körner Edgar

Audi Electronics Venture GmbH, D-85045 Ingolstadt, Germany.

IEEE Trans Syst Man Cybern B Cybern. 2005 Jun;35(3):426-37. doi: 10.1109/tsmcb.2005.846649.

A major problem in designing artificial neural networks is the proper choice of the network architecture. Especially for vision networks classifying three-dimensional (3-D) objects this problem is very challenging, as these networks are necessarily large and therefore the search space for defining the needed networks is of a very high dimensionality. This strongly increases the chances of obtaining only suboptimal structures from standard optimization algorithms. We tackle this problem in two ways. First, we use biologically inspired hierarchical vision models to narrow the space of possible architectures and to reduce the dimensionality of the search space. Second, we employ evolutionary optimization techniques to determine optimal features and nonlinearities of the visual hierarchy. Here, we especially focus on higher order complex features in higher hierarchical stages. We compare two different approaches to perform an evolutionary optimization of these features. In the first setting, we directly code the features into the genome. In the second setting, in analogy to an ontogenetical development process, we suggest the new method of an indirect coding of the features via an unsupervised learning process, which is embedded into the evolutionary optimization. In both cases the processing nonlinearities are encoded directly into the genome and are thus subject to optimization. The fitness of the individuals for the evolutionary selection process is computed by measuring the network classification performance on a benchmark image database. Here, we use a nearest-neighbor classification approach, based on the hierarchical feature output. We compare the found solutions with respect to their ability to generalize. We differentiate between a first- and a second-order generalization. 
The first-order generalization denotes how well the vision system, after evolutionary optimization of the features and nonlinearities using a database A, can classify previously unseen test views of objects from this database A. As second-order generalization, we denote the ability of the vision system to perform classification on a database B using the features and nonlinearities optimized on database A. We show that the direct feature coding approach leads to networks with a better first-order generalization, whereas the second-order generalization is on an equally high level for both direct and indirect coding. We also compare the second-order generalization results with other state-of-the-art recognition systems and show that both approaches lead to optimized recognition systems, which are highly competitive with recent recognition algorithms.
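The optimization loop and the two generalization measures described above can be illustrated with a small sketch. It is not the authors' implementation. It assumes a (mu + lambda) evolution strategy with Gaussian mutation, a single layer of linear features followed by a rectifying threshold as the only nonlinearity, and randomly generated toy "views" in place of the image databases. All names (`features`, `nn_accuracy`, `evolve`, `make_views`) and all parameter values are hypothetical. Only the direct coding setting is shown: feature weights and the threshold sit in the genome, and fitness is the nearest-neighbor classification rate on held-out views of database A.

```python
import random

def features(x, genome, n_feat):
    """Rectified linear feature responses (assumed form of the nonlinearity).
    genome = n_feat weight vectors of len(x), followed by one threshold."""
    d, theta = len(x), genome[-1]
    out = []
    for k in range(n_feat):
        w = genome[k * d:(k + 1) * d]
        out.append(max(0.0, sum(wi * xi for wi, xi in zip(w, x)) - theta))
    return out

def nn_accuracy(genome, n_feat, train, test):
    """Nearest-neighbor classification rate, computed in feature space."""
    stored = [(features(x, genome, n_feat), y) for x, y in train]
    hits = 0
    for x, y in test:
        f = features(x, genome, n_feat)
        _, pred = min(stored,
                      key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], f)))
        hits += (pred == y)
    return hits / len(test)

def evolve(fitness, genome_len, mu=3, lam=12, sigma=0.3, generations=15, rng=None):
    """(mu + lambda) evolution strategy; returns best genome and fitness history."""
    rng = rng or random.Random(0)
    init = [[rng.gauss(0, 1) for _ in range(genome_len)] for _ in range(mu)]
    parents = sorted(init, key=fitness, reverse=True)
    history = [fitness(parents[0])]
    for _ in range(generations):
        offspring = [[g + rng.gauss(0, sigma) for g in rng.choice(parents)]
                     for _ in range(lam)]
        # Plus-selection: parents compete with offspring, so the best never gets worse.
        parents = sorted(parents + offspring, key=fitness, reverse=True)[:mu]
        history.append(fitness(parents[0]))
    return parents[0], history

def make_views(rng, prototypes, n_views, noise=0.2):
    """Toy stand-in for object views: noisy copies of one prototype per class."""
    return [([p + rng.gauss(0, noise) for p in proto], label)
            for label, proto in enumerate(prototypes) for _ in range(n_views)]

rng = random.Random(1)
D, N_FEAT = 6, 4
protos_a = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(3)]  # database A
protos_b = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(3)]  # database B
train_a, val_a, test_a = (make_views(rng, protos_a, 4) for _ in range(3))
train_b, test_b = (make_views(rng, protos_b, 4) for _ in range(2))

# Fitness uses database A only.
fitness = lambda g: nn_accuracy(g, N_FEAT, train_a, val_a)
best, history = evolve(fitness, N_FEAT * D + 1, rng=rng)

# First-order: unseen views of database A. Second-order: database B, same genome.
first_order = nn_accuracy(best, N_FEAT, train_a, test_a)
second_order = nn_accuracy(best, N_FEAT, train_b, test_b)
print("first-order:", first_order, "second-order:", second_order)
```

In the indirect coding setting, the genome would instead hold the parameters of an unsupervised learning procedure, and the features would be learned inside each fitness evaluation. That variant is not shown here.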


