Chen Zhihong, Chen Xueyun, Ye Chenghong, Wu Shaojie, Wu Xiang
School of Electrical Engineering, Guangxi University, Nanning, 530004, China.
Sci Rep. 2025 Aug 20;15(1):30608. doi: 10.1038/s41598-025-16386-7.
With the rapid advancement of Unmanned Aerial Vehicle (UAV) applications, vision-based 3D scene reconstruction has demonstrated significant value in fields such as remote sensing and target detection. However, scenes captured by UAVs are often large-scale, sparsely viewed, and complex. These characteristics pose significant challenges for neural radiance field (NeRF)-based reconstruction. Specifically, the reconstructed images may suffer from blurred edges and unclear textures. This is primarily due to the lack of edge information and the fact that certain objects appear in only a few images, leading to incomplete reconstructions. To address these issues, this paper proposes a hybrid image encoder that combines convolutional neural networks and Transformers to extract image features that assist NeRF in scene reconstruction and novel-view image synthesis. Furthermore, we extend the NeRF architecture with an additional branch that estimates uncertainty values for transient regions of the scene, enabling the model to suppress dynamic content and focus on reconstructing static structure. To further improve synthesis quality, we also refine the training loss function to better guide network optimization. Experimental results on a custom UAV aerial imagery dataset demonstrate the effectiveness of our method in accurately reconstructing and rendering UAV-captured scenes.
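The uncertainty branch described in the abstract can be understood through the loss it typically induces: pixels the network flags as transient (high uncertainty) are down-weighted in the photometric error, while a regularizer stops the network from inflating uncertainty everywhere. The sketch below is a minimal NumPy illustration of such an uncertainty-weighted loss in the style of NeRF-W; the function name, the exact form of the terms, and the `beta_min` floor are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def uncertainty_weighted_loss(pred, target, beta, beta_min=0.03):
    """Photometric loss attenuated by a per-pixel uncertainty prediction.

    pred, target : arrays of rendered and ground-truth pixel colors
    beta         : per-pixel uncertainty from the auxiliary branch (>= 0)
    beta_min     : small floor keeping the loss bounded (assumed value)
    """
    beta = beta + beta_min                     # floor the uncertainty
    mse = (pred - target) ** 2 / (2.0 * beta ** 2)  # high beta -> small weight
    reg = np.log(beta)                         # penalize large uncertainty
    return float(np.mean(mse + reg))
```

With this form, a pixel covered by a moving object can "buy" a large error by predicting a large `beta`, paying only the logarithmic penalty, so the gradient concentrates on static regions.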