Department of Computer Science, University of North Carolina, Charlotte, NC 28223, USA.
IEEE Trans Image Process. 2011 Mar;20(3):837-54. doi: 10.1109/TIP.2010.2073476. Epub 2010 Sep 7.
In this paper, a structured max-margin learning algorithm is developed to train a large number of inter-related classifiers more effectively for multilabel image annotation. First, to leverage multilabel images for classifier training, each multilabel image is partitioned into a set of image instances (image regions or image patches), and an automatic instance label identification algorithm is developed to assign the multiple labels, which are given at the image level, to the most relevant image instances. A K-way min-max cut algorithm is developed for automatic instance clustering and kernel weight determination, where multiple base kernels are combined to address the issue of huge intra-concept visual diversity more effectively. Second, a visual concept network is constructed to characterize the inter-concept visual similarity contexts more precisely in the high-dimensional multimodal feature space. The visual concept network is used to determine the inter-related learning tasks directly in the feature space rather than in the label space, because the feature space is the common space for classifier training and image classification. Third, a parallel computing platform is developed to achieve more effective learning of a large number of inter-related classifiers over the visual concept network. The structured max-margin learning algorithm is developed by incorporating the visual concept network, max-margin Markov networks, and multitask learning to address the issue of huge inter-concept visual similarity more effectively. By leveraging the inter-concept visual similarity contexts for inter-related classifier training, the algorithm can significantly enhance the discrimination power of the inter-related classifiers. Experiments on a large number of object classes and image concepts have obtained very positive results.
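The abstract mentions that multiple base kernels are combined to handle intra-concept visual diversity. As a minimal sketch of that general idea only, the Python below forms a convex combination of base kernels. The choice of RBF base kernels, the `gamma` values, and the fixed weights are all assumptions made for illustration; in the paper the kernel weights are determined by the K-way min-max cut algorithm, which is not reproduced here. The names `rbf_kernel` and `combine_kernels` are hypothetical.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """RBF (Gaussian) kernel matrix between the rows of X and the rows of Y."""
    # Pairwise squared Euclidean distances via the expansion
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
    sq_dists = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def combine_kernels(kernels, weights):
    """Convex combination of base kernel matrices.

    Non-negative weights that sum to one keep the result a valid
    (positive semidefinite) kernel when every base kernel is valid.
    """
    w = np.asarray(weights, dtype=float)
    if len(w) != len(kernels):
        raise ValueError("need exactly one weight per base kernel")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    w = w / w.sum()  # normalize so the weights sum to one
    return sum(wi * K for wi, K in zip(w, kernels))


# Illustrative usage with made-up feature vectors and hand-picked weights.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))       # 6 image instances, 4-dim features
gammas = [0.1, 1.0, 10.0]                # assumed bandwidths, one per base kernel
base_kernels = [rbf_kernel(features, features, g) for g in gammas]
combined = combine_kernels(base_kernels, weights=[0.5, 0.3, 0.2])
```

The combined matrix can be passed to any kernel method that accepts a precomputed kernel. Learning the weights from data, as the paper does, would replace the hand-picked `weights` argument.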