College of Communication Engineering, Army Engineering University of PLA, Nanjing 210001, China.
College of Command and Control, Army Engineering University of PLA, Nanjing 210001, China.
Sensors (Basel). 2022 Sep 21;22(19):7172. doi: 10.3390/s22197172.
Video compression sensing (VCS) can recover the original video from only a few measurements through reconstruction algorithms. There is a natural correlation between video frames, and exploiting this correlation is the key to improving reconstruction quality. An increasing number of deep-learning-based VCS methods have been proposed. Some of them overlook interframe information and therefore fail to achieve satisfactory reconstruction quality; others exploit interframe information with complex network structures, which increases the number of parameters and complicates training. To overcome the limitations of existing VCS methods, we propose an efficient end-to-end VCS network that integrates measurement and reconstruction into a single framework. In the measurement part, we train the measurement matrix rather than using a pre-defined random matrix, so it better fits the video reconstruction task. In the reconstruction part, an unfolded LSTM network deeply fuses intra- and interframe spatial-temporal information. The proposed method achieves higher reconstruction accuracy than existing VCS networks and performs well even at measurement ratios as low as 0.01.
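To illustrate the two ideas highlighted in the abstract (a jointly trained measurement matrix instead of a fixed random one, and an LSTM that fuses intra- and interframe information during reconstruction), the following is a minimal PyTorch sketch. The block size, measurement ratio, hidden width, class and variable names are illustrative assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn

class VCSNet(nn.Module):
    # Minimal sketch of an end-to-end VCS pipeline: a learned measurement
    # matrix followed by per-frame initial reconstruction and an LSTM that
    # propagates interframe (temporal) context. Dimensions are assumptions.
    def __init__(self, block_size=32, measurement_ratio=0.10, hidden=512):
        super().__init__()
        n = block_size * block_size                      # pixels per block
        m = max(1, int(round(measurement_ratio * n)))    # measurements per block
        # Learned measurement matrix, trained jointly with the reconstruction
        # stage (replaces a pre-prepared random sensing matrix)
        self.measure = nn.Linear(n, m, bias=False)
        # Rough per-frame (intraframe) estimate from the measurements
        self.init_recon = nn.Linear(m, n)
        # LSTM fuses information across frames (interframe, temporal)
        self.temporal = nn.LSTM(input_size=n, hidden_size=hidden, batch_first=True)
        # Map the fused features back to pixel blocks
        self.refine = nn.Linear(hidden, n)

    def forward(self, video_blocks):
        # video_blocks: (batch, frames, block_size*block_size) flattened blocks
        y = self.measure(video_blocks)       # compressive measurements
        x0 = self.init_recon(y)              # initial per-frame reconstruction
        fused, _ = self.temporal(x0)         # interframe context propagation
        return x0 + self.refine(fused)       # residual refinement

# Usage: an 8-frame clip of 32x32 blocks at a 0.10 measurement ratio.
model = VCSNet(block_size=32, measurement_ratio=0.10)
clip = torch.rand(4, 8, 32 * 32)
recon = model(clip)
print(recon.shape)  # torch.Size([4, 8, 1024])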