Large-Scale Aerial Image Categorization Using a Multitask Topological Codebook.

Publication information

IEEE Trans Cybern. 2016 Feb;46(2):535-45. doi: 10.1109/TCYB.2015.2408592. Epub 2015 Mar 16.

Abstract

Quickly and accurately categorizing the millions of aerial images on Google Maps is a useful technique in pattern recognition. Existing methods cannot handle this task successfully for two reasons: 1) the topologies of aerial images are the key feature for distinguishing their categories, but they cannot be effectively encoded by a conventional visual codebook, and 2) it is challenging to build a real-time image categorization system, as some geo-aware apps update over 20 aerial images per second. To solve these problems, we propose an efficient aerial image categorization algorithm. It focuses on learning a discriminative topological codebook of aerial images under a multitask learning framework. The pipeline can be summarized as follows. We first construct a region adjacency graph (RAG) that describes the topology of each aerial image. Naturally, aerial image categorization can then be formulated as RAG-to-RAG matching. According to graph theory, RAG-to-RAG matching is conducted by exhaustively comparing all of the two graphs' respective graphlets (i.e., small subgraphs). To reduce the high time cost, we propose to learn a codebook containing topologies that are jointly discriminative across multiple categories. The learned topological codebook guides the extraction of the discriminative graphlets. Finally, these graphlets are integrated into an AdaBoost model for predicting aerial image categories. Experimental results show that our approach is competitive with several existing recognition models. Furthermore, over 24 aerial images are processed per second, demonstrating that our approach is ready for real-world applications.
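The first two steps of the pipeline (building a RAG, then enumerating its graphlets) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_rag`, `is_connected`, and `enumerate_graphlets`, the toy label grid, and the use of 4-connectivity are all assumptions made for illustration. The sketch shows only the brute-force enumeration that the abstract describes as costly; the learned codebook, which would restrict extraction to discriminative graphlets, and the AdaBoost classifier are not modeled here.

```python
from itertools import combinations


def build_rag(labels):
    """Build a region adjacency graph from a 2-D grid of region labels.

    Two regions are adjacent if any of their pixels touch horizontally or
    vertically (4-connectivity, an assumption of this sketch).
    Returns a dict mapping each region label to the set of its neighbours.
    """
    adj = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            adj.setdefault(a, set())
            # Look right and down only, so each pixel pair is visited once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        adj[a].add(b)
                        adj.setdefault(b, set()).add(a)
    return adj


def is_connected(nodes, adj):
    """Return True if `nodes` induce a connected subgraph of `adj`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u] & nodes:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def enumerate_graphlets(adj, max_size):
    """Enumerate every connected node subset with at most `max_size` nodes.

    This is exhaustive and grows combinatorially with the number of regions.
    """
    graphlets = []
    for k in range(1, max_size + 1):
        for combo in combinations(sorted(adj), k):
            if is_connected(combo, adj):
                graphlets.append(combo)
    return graphlets


# Hypothetical segmentation of a tiny image into four regions (labels 0-3).
toy_labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
]
toy_rag = build_rag(toy_labels)
toy_graphlets = enumerate_graphlets(toy_rag, 3)
print(toy_rag)
print(len(toy_graphlets), "graphlets with at most 3 regions")
```

Even in this toy case the number of candidate subsets grows quickly with the region count and the graphlet size, which is the cost that a codebook-guided extraction step is meant to avoid.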
