Liu Xiao-Yang, Huang Qifan, Han Xiaochen, Wu Bo, Kong Linghe, Walid Anwar, Wang Xiaodong
IEEE Trans Neural Netw Learn Syst. 2023 Jun 2;PP. doi: 10.1109/TNNLS.2023.3266998.
Snapshot compressive imaging (SCI) cameras compress high-speed videos or hyperspectral images into measurement frames. However, decoding the data frames from the measurement frames is compute-intensive. Existing state-of-the-art decoding algorithms suffer from low decoding quality, long running times, or both, which makes them impractical for real-time applications. In this article, we exploit the powerful learning ability of deep neural networks (DNNs) and propose a novel tensor fast iterative shrinkage-thresholding algorithm net (Tensor FISTA-Net) as a real-time decoder for SCI cameras. Since SCI cameras have an accurate physical model, we can trade training time for decoding time by generating abundant synthetic data and training a decoder on the cloud. Tensor FISTA-Net not only learns a sparse representation of the frames through convolution layers but also significantly reduces decoding time and memory consumption through tensor operations, which makes it well suited to real-time decoding. Tensor FISTA-Net obtains an average PSNR improvement of 0.79-2.84 dB (video images) and 2.61-4.43 dB (hyperspectral images) over the state-of-the-art algorithms, along with clearer and more detailed visual results on the real SCI datasets Hammer and Wheel, respectively. Tensor FISTA-Net reaches 45 frames per second on video datasets and 70 frames per second on hyperspectral datasets, meeting the real-time requirement. In addition, the trained model has a memory footprint of only 12 MB, making it applicable to real-time Internet of Things (IoT) applications.
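The abstract relies on two ideas that a short sketch may help make concrete: generating synthetic measurements from a known physical model, and scoring a reconstruction by PSNR. The Python below is a minimal illustration only, assuming the commonly used SCI formulation in which each frame is multiplied element-wise by a mask and the modulated frames are summed into one measurement. The function names `sci_measure` and `psnr`, the random binary masks, and the array sizes are hypothetical choices for this example and are not taken from the paper; the decoder itself is not reproduced here.

```python
import numpy as np


def sci_measure(frames, masks):
    """Collapse B frames into one snapshot (assumed model: Y = sum_b masks[b] * frames[b])."""
    return np.sum(masks * frames, axis=0)


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals with maximum value `peak`."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * np.log10(peak ** 2 / mse)


# Illustrative sizes: 8 frames of 32x32 pixels with values in [0, 1).
rng = np.random.default_rng(0)
B, H, W = 8, 32, 32
frames = rng.random((B, H, W))

# Random binary masks stand in for the camera's coded aperture (an assumption).
masks = rng.integers(0, 2, size=(B, H, W)).astype(float)

# One synthetic (measurement, ground truth) pair of the kind used for training.
measurement = sci_measure(frames, masks)
```

Because the forward model is cheap to evaluate, arbitrarily many such pairs can in principle be generated from existing video or hyperspectral data, which is the sense in which training time is traded for decoding time.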