IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):1015-1028. doi: 10.1109/TPAMI.2017.2701380. Epub 2017 May 4.
Super-resolving a low-resolution video, namely video super-resolution (SR), is usually handled by either single-image SR or multi-frame SR. Single-image SR processes each video frame independently and ignores the intrinsic temporal dependency between frames, which plays a very important role in video SR. Multi-frame SR generally extracts motion information, e.g., optical flow, to model the temporal dependency, but often at high computational cost. Considering that recurrent neural networks (RNNs) can model the long-term temporal dependency of video sequences well, we propose a fully convolutional RNN, named the bidirectional recurrent convolutional network, for efficient multi-frame SR. It differs from vanilla RNNs in two ways: 1) the commonly used full feedforward and recurrent connections are replaced with weight-sharing convolutional connections, which greatly reduce the number of network parameters and model the temporal dependency at a finer level, i.e., patch-based rather than frame-based; and 2) connections from input layers at previous timesteps to the current hidden layer are added via 3D feedforward convolutions, which aim to capture discriminative spatio-temporal patterns for short-term fast-varying motions in local adjacent frames. Because convolutional operations are cheap, our model has low computational complexity and runs orders of magnitude faster than other multi-frame SR methods. With its powerful temporal dependency modeling, our model can super-resolve videos with complex motions and achieve good performance.
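The two architectural changes described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: it assumes single-channel frames, one hidden layer, and no upsampling. All function names and kernel values are hypothetical. The 3D feedforward convolution is reduced to a 2-D convolution over the previous frame, and summing the forward and backward hidden states is an assumption about how the two directions are combined.

```python
def conv2d_same(image, kernel):
    """2-D cross-correlation with zero padding; output keeps the input shape."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def relu_of_sum(maps):
    """Element-wise sum of equally sized 2-D maps followed by ReLU."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[max(0.0, sum(m[i][j] for m in maps)) for j in range(w)]
            for i in range(h)]


def directional_pass(frames, k_ff, k_rec, k_prev):
    """One temporal direction of the toy recurrent convolutional layer.

    k_ff   : feedforward kernel, current frame -> current hidden state
    k_rec  : recurrent kernel, previous hidden state -> current hidden state
    k_prev : kernel from the previous input frame to the current hidden state
             (assumed stand-in for the 3D feedforward convolution)
    """
    hidden, h_prev, x_prev = [], None, None
    for x in frames:
        terms = [conv2d_same(x, k_ff)]
        if h_prev is not None:
            terms.append(conv2d_same(h_prev, k_rec))
            terms.append(conv2d_same(x_prev, k_prev))
        h = relu_of_sum(terms)
        hidden.append(h)
        h_prev, x_prev = h, x
    return hidden


def bidirectional_pass(frames, fwd_kernels, bwd_kernels):
    """Run forward and backward passes and sum them per frame (assumed fusion)."""
    fwd = directional_pass(frames, *fwd_kernels)
    bwd = directional_pass(frames[::-1], *bwd_kernels)[::-1]
    return [[[f + b for f, b in zip(f_row, b_row)]
             for f_row, b_row in zip(f_map, b_map)]
            for f_map, b_map in zip(fwd, bwd)]


# Usage: three 2x2 frames and hypothetical 1x1 kernels shared by both directions.
frames = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
kernels = ([[1.0]], [[0.5]], [[0.25]])
features = bidirectional_pass(frames, kernels, kernels)
```

The same small kernels are reused at every spatial position and every timestep, which is where the parameter saving comes from. As an illustrative calculation, a full recurrent connection between two 32x32 hidden maps would need 1024 x 1024 = 1,048,576 weights, whereas a 3x3 recurrent kernel needs 9.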