IEEE Trans Image Process. 2021;30:7803-7814. doi: 10.1109/TIP.2021.3079820. Epub 2021 Sep 14.
Understanding the sophisticated topological structures in aerial photographs is a useful capability in aerial image analysis. Conventional methods cannot fulfill this task because of three challenges: 1) the number of topologies in an aerial photo grows exponentially with topology size, which requires a fine-grained visual descriptor to discriminatively represent each topology; 2) visually/semantically salient topologies within each aerial photo must be identified in a weakly-labeled context, since pixel-level annotation demands unaffordable human effort; and 3) a cross-domain knowledge transfer module is needed to augment aerial photo perception, since multi-resolution aerial photos are taken asynchronously in practice. To handle these problems, we propose a unified framework for understanding aerial photo topologies. It represents each aerial photo by a set of visually/semantically salient topologies based on human visual perception and then employs them for visual categorization. Specifically, we first extract multiple atomic regions from each aerial photo and build graphlets from them to capture the topology of each photo. Then, a weakly-supervised ranking algorithm selects a few semantically salient graphlets by seamlessly encoding multiple image-level attributes. Toward a visualizable and perception-aware framework, we construct a gaze shifting path (GSP) by linking the top-ranking graphlets. Finally, we derive the deep GSP representation and formulate a semi-supervised, cross-domain SVM to classify aerial photos into multiple categories. The SVM uses the global composition from low-resolution counterparts to enhance the deep GSP features from partially annotated high-resolution aerial photos. Extensive visualization results and categorization performance comparisons demonstrate the competitiveness of our approach.
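The graphlet and GSP steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the region adjacency map, the per-region `saliency` values, and the mean-saliency score are all hypothetical stand-ins (the paper uses a weakly-supervised ranking algorithm over image-level attributes), and the function names `enumerate_graphlets` and `gaze_shifting_path` are invented here for illustration. The sketch treats a graphlet as a connected subset of adjacent atomic regions and forms a path by ordering the top-scoring graphlets.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph of `adjacency`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def enumerate_graphlets(adjacency, max_size):
    """List every connected subset of 1..max_size regions (brute force)."""
    regions = sorted(adjacency)
    graphlets = []
    for size in range(1, max_size + 1):
        for subset in combinations(regions, size):
            if is_connected(subset, adjacency):
                graphlets.append(subset)
    return graphlets


def gaze_shifting_path(graphlets, score, top_k):
    """Return the top_k graphlets ordered from highest to lowest score."""
    ranked = sorted(graphlets, key=score, reverse=True)
    return ranked[:top_k]


# Hypothetical toy input: four atomic regions arranged in a chain 0-1-2-3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# Hypothetical per-region saliency; a placeholder for the learned ranking.
saliency = {0: 0.1, 1: 0.9, 2: 0.5, 3: 0.2}


def mean_saliency(graphlet):
    """Placeholder score: average saliency of the regions in a graphlet."""
    return sum(saliency[r] for r in graphlet) / len(graphlet)


graphlets = enumerate_graphlets(adjacency, max_size=3)
path = gaze_shifting_path(graphlets, mean_saliency, top_k=3)
print(len(graphlets), path)
```

Brute-force enumeration is used only to keep the sketch short. Its cost grows combinatorially with `max_size`, which mirrors the first challenge named in the abstract and is why a ranking step that keeps only a few salient graphlets matters.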