Qiao Huijiao, Qian Weiqi, Hu Haifeng, Huang Xingbo, Li Jiequn
Department of Surveying Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China.
Taiyuan Research Institute of China Coal Technology & Engineering Group, Taiyuan 030006, China.
Sensors (Basel). 2024 Aug 11;24(16):5205. doi: 10.3390/s24165205.
Data and reports indicate that natural disasters are increasing in frequency and intensity worldwide. Buildings play a crucial role in disaster response and damage assessment, aiding rescue planning and loss evaluation. Despite advances in applying deep learning to building extraction, challenges remain in handling complex natural disaster scenes and in reducing reliance on labeled datasets. Recent advances in satellite video open a new avenue for efficient and accurate building extraction. By thoroughly mining the characteristics of disaster video data, this work proposes a new semantic segmentation model for accurate and efficient building extraction from a limited amount of training data. The model consists of two parts: a prediction module and an automatic correction module. The prediction module, based on a basic encoder-decoder structure, performs an initial extraction of buildings using a limited amount of training data that are obtained instantly. The automatic correction module then takes the output of the prediction module as input, constructs a criterion for identifying pixels with erroneous semantic information, and uses optical flow values to extract the accurate corresponding semantic information on the corrected frame. The experimental results demonstrate that the proposed method outperforms other methods in both accuracy and computational complexity in complicated natural disaster scenes.
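The correction step can be illustrated with a minimal sketch, which is not the authors' implementation. The abstract does not state the actual criterion for erroneous pixels, so the sketch assumes a hypothetical one: a pixel is suspicious when its predicted building probability is uncertain and its label disagrees with the label carried over from a reference frame by optical flow. The function names, the thresholds `low` and `high`, and the backward-flow convention are all illustrative assumptions.

```python
import numpy as np


def warp_labels(ref_labels, flow):
    """Carry reference-frame labels onto the current frame.

    Assumed convention (hypothetical): flow[y, x] = (dx, dy) points from
    pixel (x, y) in the current frame to its position in the reference frame.
    Nearest-neighbour sampling keeps the labels discrete.
    """
    h, w = ref_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return ref_labels[src_y, src_x]


def correct_mask(pred_prob, ref_labels, flow, low=0.3, high=0.7):
    """Replace suspicious predictions with flow-propagated labels.

    pred_prob  : (H, W) building probabilities from a prediction module
    ref_labels : (H, W) binary building mask of a reference frame
    flow       : (H, W, 2) optical flow, convention as in warp_labels
    low, high  : assumed uncertainty band, not taken from the paper
    """
    pred = (pred_prob >= 0.5).astype(np.uint8)
    warped = warp_labels(ref_labels, flow)
    # Hypothetical criterion: uncertain probability AND disagreement
    # with the label propagated by optical flow.
    uncertain = (pred_prob > low) & (pred_prob < high)
    suspicious = uncertain & (pred != warped)
    corrected = np.where(suspicious, warped, pred).astype(np.uint8)
    return corrected, suspicious
```

In practice the flow field would come from an optical flow estimator applied to consecutive video frames; here it is simply an input array, so the sketch depends only on NumPy.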