Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China.
College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China.
Sensors (Basel). 2021 Feb 16;21(4):1375. doi: 10.3390/s21041375.
Specific buildings are of great significance in smart city planning, management practices, and even military applications. However, it is difficult for traditional classification or target identification methods to distinguish different types of buildings in remote sensing images, because the characteristics of the environmental landscape around the buildings (such as the pixels of roads and parking areas) are complex and hard to define with simple rules. Convolutional neural networks (CNNs) have a strong capacity to mine information from the spatial context and have been used in many image processing tasks. Here, we developed a novel CNN model named YOLO-S-CIOU, which improves on YOLOv3 for specific building detection in two respects: (1) the Darknet53 module in YOLOv3 was replaced with SRXnet (constructed by stacking multiple SE-ResNeXt units) to significantly improve the feature learning ability of YOLO-S-CIOU while keeping a complexity similar to that of YOLOv3; (2) Complete-IoU loss (CIoU loss) was used to obtain better bounding box regression. We took the gas station as an example. Experimental results on a self-made gas station dataset (GS dataset) showed that YOLO-S-CIOU achieved an average precision (AP) of 97.62% and an F1 score of 97.50% with 59,065,366 parameters. Compared with YOLOv3, YOLO-S-CIOU reduced the number of parameters by 2,510,977 (about 4%) and improved the AP by 2.23% and the F1 score by 0.5%. Moreover, in gas station detection in Tumshuk City and Yanti City, the recall (R) and precision (P) of YOLO-S-CIOU were 50% and 40% higher, respectively, than those of YOLOv3. This shows that the proposed network has stronger robustness and higher detection ability for remote sensing images from different regions.
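The abstract does not give the internal configuration of SRXnet, so the following PyTorch sketch of a single SE-ResNeXt unit uses assumed hyperparameters (cardinality 32, SE reduction ratio 16, LeakyReLU activations in the YOLO style). It illustrates the grouped-convolution bottleneck plus squeeze-and-excitation channel reweighting that such a unit combines, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class SEResNeXtBlock(nn.Module):
    """Illustrative SE-ResNeXt unit: ResNeXt-style grouped-convolution
    bottleneck followed by squeeze-and-excitation channel reweighting.
    Hyperparameters are assumptions, not the paper's reported settings."""
    def __init__(self, channels, cardinality=32, reduction=16):
        super().__init__()
        mid = channels // 2
        self.bottleneck = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, groups=cardinality, bias=False),
            nn.BatchNorm2d(mid), nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Squeeze-and-excitation: global pooling -> bottleneck FC (as 1x1 convs) -> sigmoid gate
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        y = self.bottleneck(x)
        y = y * self.se(y)      # channel-wise reweighting
        return self.act(x + y)  # residual connection
```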
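CIoU loss augments the IoU term with a normalized centre-distance penalty and an aspect-ratio consistency term. The sketch below implements the standard published formulation, L_CIoU = 1 - IoU + rho^2(b, b_gt)/c^2 + alpha*v, for corner-format boxes; it follows the general CIoU definition rather than the authors' specific training code.

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """Complete-IoU loss for axis-aligned boxes given as (x1, y1, x2, y2) tensors."""
    # Intersection over union
    x1 = torch.max(pred[..., 0], target[..., 0])
    y1 = torch.max(pred[..., 1], target[..., 1])
    x2 = torch.min(pred[..., 2], target[..., 2])
    y2 = torch.min(pred[..., 3], target[..., 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared centre distance (rho^2) over squared diagonal of the smallest enclosing box (c^2)
    cx_p = (pred[..., 0] + pred[..., 2]) / 2
    cy_p = (pred[..., 1] + pred[..., 3]) / 2
    cx_t = (target[..., 0] + target[..., 2]) / 2
    cy_t = (target[..., 1] + target[..., 3]) / 2
    rho2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha
    w_p, h_p = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    w_t, h_t = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    v = (4 / math.pi ** 2) * (torch.atan(w_t / (h_t + eps)) - torch.atan(w_p / (h_p + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```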