Feng Jie, Liang Yuping, Zhang Xiangrong, Zhang Junpeng, Jiao Licheng
IEEE Trans Image Process. 2023;32:1788-1801. doi: 10.1109/TIP.2023.3251026. Epub 2023 Mar 14.
In satellite videos, moving vehicles are extremely small and densely clustered within vast scenes. Anchor-free detectors show great potential because they predict the keypoints and boundaries of objects directly. However, most anchor-free detectors do not consider the density distribution and therefore miss objects in dense clusters of small vehicles. Furthermore, weak appearance features and massive interference in satellite videos limit the application of anchor-free detectors. To address these problems, a novel semantic-embedded density adaptive network (SDANet) is proposed. In SDANet, cluster-proposals, each containing a variable number of objects, and centers are generated in parallel through pixel-wise prediction. A novel density matching algorithm then obtains each object by partitioning the cluster-proposals and matching the corresponding centers hierarchically and recursively. Meanwhile, isolated cluster-proposals and centers are suppressed. SDANet also segments the road in vast scenes and embeds its semantic features into the network through weakly supervised learning, which guides the detector to emphasize the regions of interest. In this way, SDANet reduces the false detections caused by massive interference. To alleviate the lack of appearance information on small vehicles, a customized bi-directional conv-RNN module extracts temporal information from consecutive input frames by aligning the disturbed background. Experimental results on Jilin-1 and SkySat satellite videos demonstrate the effectiveness of SDANet, especially for dense objects.
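The abstract does not spell out the density matching step, so the following is only an illustrative sketch of the general idea (partition a cluster-proposal recursively until each part holds one center, and drop isolated proposals and centers), not the authors' algorithm. The box and point representations, the median-split rule, the first-match assignment of centers, and the names `contains`, `split_cluster`, and `density_match` are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format
Point = Tuple[float, float]              # (x, y), assumed format


def contains(box: Box, p: Point) -> bool:
    """Return True if point p lies inside box (borders included)."""
    x0, y0, x1, y1 = box
    return x0 <= p[0] <= x1 and y0 <= p[1] <= y1


def split_cluster(box: Box, centers: List[Point]) -> List[Tuple[Box, Point]]:
    """Recursively cut a box along its longer side until each part holds one center.

    Hypothetical partition rule: sort the centers along the longer axis and cut
    halfway between the two centers on either side of the median.
    """
    if len(centers) == 1:
        return [(box, centers[0])]
    x0, y0, x1, y1 = box
    axis = 0 if (x1 - x0) >= (y1 - y0) else 1
    ordered = sorted(centers, key=lambda c: c[axis])
    mid = len(ordered) // 2
    cut = (ordered[mid - 1][axis] + ordered[mid][axis]) / 2.0
    if axis == 0:
        first, second = (x0, y0, cut, y1), (cut, y0, x1, y1)
    else:
        first, second = (x0, y0, x1, cut), (x0, cut, x1, y1)
    return split_cluster(first, ordered[:mid]) + split_cluster(second, ordered[mid:])


def density_match(proposals: List[Box], centers: List[Point]) -> List[Tuple[Box, Point]]:
    """Match centers to cluster-proposals and return one (box, center) pair per object."""
    groups: List[List[Point]] = [[] for _ in proposals]
    for c in centers:
        for i, box in enumerate(proposals):
            if contains(box, c):
                groups[i].append(c)
                break  # first match wins; a center inside no proposal is dropped
    objects: List[Tuple[Box, Point]] = []
    for box, group in zip(proposals, groups):
        if group:  # a proposal with no center is treated as isolated and skipped
            objects.extend(split_cluster(box, group))
    return objects


if __name__ == "__main__":
    proposals = [(0, 0, 10, 4), (20, 20, 25, 25)]
    centers = [(2, 2), (8, 2), (50, 50)]
    for box, center in density_match(proposals, centers):
        print(box, center)
```

In this toy input, the first proposal holds two centers and is cut into two boxes, while the second proposal and the center at (50, 50) have no counterpart and are discarded. A real detector would work on predicted score maps and learned thresholds rather than hand-written boxes.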