Multimodal Fusion Image Stabilization Algorithm for Bio-Inspired Flapping-Wing Aircraft

Author information

Wang Zhikai, Wang Sen, Hu Yiwen, Zhou Yangfan, Li Na, Zhang Xiaofeng

Affiliations

College of Information Engineering, Henan University of Science and Technology, Luoyang 471023, China.

Henan Key Laboratory of Robot and Intelligent System, Henan University of Science and Technology, Luoyang 471023, China.

Publication information

Biomimetics (Basel). 2025 Jul 7;10(7):448. doi: 10.3390/biomimetics10070448.

DOI: 10.3390/biomimetics10070448

PMID: 40710261

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12292680/

Abstract

This paper presents FWStab, a video stabilization dataset built specifically for flapping-wing platforms. The dataset covers five typical flight scenarios and contains 48 video clips with intense dynamic jitter. Inertial Measurement Unit (IMU) sensor data are collected synchronously with the video, providing reliable support for multimodal modeling. Building on this dataset, and to address the poor image acquisition quality caused by severe vibration of the aerial vehicle, the paper proposes a multimodal signal fusion video stabilization framework. The framework fuses image features and inertial sensor features to predict smooth, stable camera poses. During stabilization, each frame is warped from the real camera motion estimated from the sensors to the smooth trajectory predicted by the network, which improves inter-frame stability. This approach maintains the global rigidity of scene motion, avoids the visual artifacts caused by traditional spatiotemporal warping based on dense optical flow, and rectifies rolling-shutter distortions. The network is trained in an unsupervised manner with a joint loss function that combines camera pose smoothness and optical flow residuals. Coupled with a multi-stage training strategy, the framework adapts well across a wide range of scenarios. Throughout, Long Short-Term Memory (LSTM) networks model the temporal characteristics of camera trajectories, enabling high-precision prediction of smooth trajectories.
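The warping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera with a known intrinsic matrix `K`, pure rotational camera motion, and rotation matrices that map world coordinates to camera coordinates. Under those assumptions, a pixel seen by the real (shaky) camera maps to the virtual (smooth) camera through the homography `K @ R_smooth @ R_real.T @ inv(K)`. The function names and the numeric values below are hypothetical.

```python
import numpy as np


def rotation_z(angle_rad):
    """Rotation about the optical axis (roll), used here only as test input."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def stabilizing_homography(K, R_real, R_smooth):
    """Homography taking pixels of the real frame to the smooth virtual camera.

    Assumption: R_real and R_smooth map world coordinates to camera
    coordinates, and the camera undergoes rotation only.
    """
    return K @ R_smooth @ R_real.T @ np.linalg.inv(K)


def warp_point(H, xy):
    """Apply a homography to one pixel and return inhomogeneous coordinates."""
    p = H @ np.array([xy[0], xy[1], 1.0])
    return p[:2] / p[2]


def project(K, R, direction):
    """Project a world-frame viewing direction into a camera with rotation R."""
    p = K @ R @ np.asarray(direction, dtype=float)
    return p[:2] / p[2]


# Hypothetical intrinsics: 500 px focal length, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

R_real = rotation_z(0.1)   # shaky orientation, e.g. integrated from the IMU
R_smooth = np.eye(3)       # smooth orientation, e.g. predicted by a network

H = stabilizing_homography(K, R_real, R_smooth)
```

Because a single homography is applied per frame (or per row band), the warp keeps scene motion globally rigid, which is the property the abstract contrasts with dense optical-flow warping. For a rolling-shutter sensor, one plausible extension is to use a different `R_real` for each image row according to its readout time; that step is not shown here.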

