IEEE Trans Image Process. 2021;30:4099-4113. doi: 10.1109/TIP.2021.3069296. Epub 2021 Apr 8.
Deep unfolding methods design deep neural networks as learned variations of optimization algorithms by unrolling their iterations. These networks have been shown to converge faster and reach higher accuracy than the original optimization methods. In this line of research, this paper presents novel interpretable deep recurrent neural networks (RNNs), designed by unfolding iterative algorithms that solve the task of sequential signal reconstruction (in particular, video reconstruction). The proposed networks are designed on the premise that patches of video frames have a sparse representation and that the temporal difference between consecutive representations is also sparse. Specifically, we design an interpretable deep RNN (coined reweighted-RNN) by unrolling the iterations of a proximal method that solves a reweighted version of the ℓ1-ℓ1 minimization problem. Due to the underlying minimization model, our reweighted-RNN has a different thresholding function (that is, a different activation function) for each hidden unit in each layer. In this way, it has higher network expressivity than existing deep unfolding RNN models. We also present the derivative ℓ1-ℓ1-RNN model, which is obtained by unfolding a proximal method for the ℓ1-ℓ1 minimization problem. We apply the proposed interpretable RNNs to the task of video frame reconstruction from low-dimensional measurements, that is, sequential video frame reconstruction. The experimental results on various datasets demonstrate that the proposed deep RNNs outperform various RNN models.
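To make the idea of unfolding concrete, the following is a minimal sketch of a generic unrolled proximal-gradient (ISTA-style) solver for a plain ℓ1-regularized least-squares problem. It is not the reweighted-RNN described in the abstract and omits the temporal ℓ1 term. It only illustrates how each iteration can be read as one layer, and how giving every unit in every layer its own threshold yields a different activation per hidden unit. All names (`soft_threshold`, `unrolled_ista`, `thresholds`, `step`) are hypothetical, and the thresholds and step size are fixed here, whereas an unfolded network would presumably learn them.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |v|: shrink v toward zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def unrolled_ista(A, x, thresholds, step):
    """Run one proximal-gradient iteration per entry of `thresholds`.

    thresholds[k][i] is the (assumed, illustrative) threshold of unit i
    in layer k, so each unit in each layer has its own activation.
    """
    At = transpose(A)
    h = [0.0] * len(A[0])  # start from the all-zero representation
    for layer_taus in thresholds:
        # Gradient of 0.5 * ||A h - x||^2 with respect to h.
        residual = [a - b for a, b in zip(matvec(A, h), x)]
        grad = matvec(At, residual)
        # Gradient step followed by per-unit soft thresholding.
        h = [
            soft_threshold(h_i - step * g_i, tau_i)
            for h_i, g_i, tau_i in zip(h, grad, layer_taus)
        ]
    return h


# Usage: three "layers" on a toy 2-dimensional problem.
A = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 0.1]
thresholds = [[0.2, 0.2], [0.2, 0.2], [0.2, 0.2]]
h = unrolled_ista(A, x, thresholds, step=1.0)
```

In this toy setting the small second measurement falls below its threshold and is set to zero, which is the sparsity-promoting behavior that the thresholding activations are meant to provide.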