

Weakly Supervised Multimodal Kernel for Categorizing Aerial Photographs.

Publication information

IEEE Trans Image Process. 2017 Aug;26(8):3748-3758. doi: 10.1109/TIP.2016.2639438. Epub 2016 Dec 14.

Abstract

Accurately distinguishing aerial photographs from different categories is a promising technique in computer vision that can facilitate a series of applications, such as video surveillance and vehicle navigation. In this paper, a new image kernel is proposed for effectively recognizing aerial photographs. The key is to encode high-level semantic cues into local image patches in a weakly supervised way, and to integrate multimodal visual features using a newly developed hashing algorithm. The pipeline is as follows. Given an aerial photo, we first extract a number of graphlets to describe its topological structure. For each graphlet, we use color and texture to capture its appearance, and a weakly supervised algorithm to capture its semantics. Aerial photo categorization can then be naturally formulated as graphlet-to-graphlet matching. Because the number of graphlets extracted from each aerial photo is huge, we accelerate matching with a hashing algorithm that seamlessly fuses the multiple visual features into binary codes. Finally, an image kernel is calculated by fast matching of the binary codes corresponding to each graphlet, and a multi-class SVM is learned for aerial photo categorization. We demonstrate the advantage of the proposed model by comparing it with state-of-the-art image descriptors. Moreover, an in-depth study of the descriptiveness of the hash-based graphlet is presented.
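
The matching step described in the abstract can be illustrated with a minimal sketch. This is not the paper's hashing algorithm or kernel: it assumes a generic random-hyperplane hash over concatenated features as a stand-in for the multimodal fusion, and a simple sum-match kernel (mean bit agreement over all graphlet pairs). The function names (`make_hyperplanes`, `hash_graphlet`, `image_kernel`) and the toy feature vectors are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes (assumed generic hash, not the paper's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_graphlet(color, texture, semantic, hyperplanes):
    """Fuse three feature vectors by concatenation, then keep one sign bit per hyperplane."""
    fused = list(color) + list(texture) + list(semantic)
    if any(len(plane) != len(fused) for plane in hyperplanes):
        raise ValueError("hyperplane dimension must match fused feature length")
    code = 0
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, fused))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code  # binary code packed into a Python int


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def image_kernel(codes_a, codes_b, n_bits):
    """Mean bit agreement over all graphlet pairs of two photos, in [0, 1]."""
    if not codes_a or not codes_b:
        return 0.0
    agree = sum(n_bits - hamming(a, b) for a in codes_a for b in codes_b)
    return agree / (n_bits * len(codes_a) * len(codes_b))


# Toy data: each graphlet has 2-D color, texture, and semantic features (made up).
N_BITS = 16
planes = make_hyperplanes(dim=6, n_bits=N_BITS)
photo_a = [
    hash_graphlet([0.9, 0.1], [0.3, 0.4], [1.0, 0.0], planes),
    hash_graphlet([0.8, 0.2], [0.2, 0.5], [1.0, 0.0], planes),
]
photo_b = [hash_graphlet([0.1, 0.9], [0.7, 0.1], [0.0, 1.0], planes)]
k_ab = image_kernel(photo_a, photo_b, N_BITS)
print(f"kernel(photo_a, photo_b) = {k_ab:.3f}")
```

Under these assumptions, the pairwise kernel values for a set of photos could be collected into a Gram matrix and passed to any SVM implementation that accepts a precomputed kernel.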

