Pablo Rodrigo Gantier Cadena, Yeqiang Qian, Chunxiang Wang, Ming Yang
IEEE Trans Image Process. 2021;30:2488-2500. doi: 10.1109/TIP.2021.3052070. Epub 2021 Feb 1.
Event-based cameras have several advantages over traditional frame-based cameras: high temporal resolution, high dynamic range, and almost no motion blur. An event sensor produces a stream of asynchronous events, each reported when the brightness at a pixel changes. This data format makes it difficult to directly apply existing vision algorithms and so to fully exploit event camera data. Thanks to advances in neural networks, important progress has been made in event-based image reconstruction. Although these networks achieve accurate reconstructions while preserving most of the event camera's properties, they still require an initialization period during which the reconstructed frames should be of the highest possible quality. In this work, we present SPADE-E2VID, a neural network model that improves both the quality of early frames in an event-based reconstructed video and the overall contrast. SPADE-E2VID improves the quality of the first reconstructed frames by 15.87% in MSE, 4.15% in SSIM, and 2.5% in LPIPS. In addition, the SPADE layer in our model allows training without a temporal loss function. Another advantage of our model is its shorter training time: using a many-to-one training style, we avoid evaluating the loss function at every step and instead execute it only once, at the end of each loop. We also carried out experiments with event cameras that do not provide polarity data; our model produces high-quality video reconstructions from non-polarity events at HD resolution (1200 × 800). The video, code, and datasets will be available at: https://github.com/RodrigoGantier/SPADE_E2VID.
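To illustrate the event data format described above, the sketch below accumulates a stream of non-polarity events (x, y, t) into a 2D event-count frame, a common way to rasterize events before feeding them to a reconstruction network. This is a minimal illustration, not the paper's code; the function name and toy event stream are assumptions.

```python
import numpy as np

def events_to_count_frame(events, height, width):
    """Accumulate non-polarity events (x, y, t) into an event-count frame.

    Illustrative sketch only: each event increments the count at its pixel.
    Real pipelines (e.g. E2VID-style networks) typically use richer
    representations such as time-binned voxel grids.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, _t in events:
        frame[y, x] += 1.0  # one brightness-change report at pixel (x, y)
    return frame

# Toy stream: three events, two at the same pixel.
events = [(0, 0, 0.001), (1, 1, 0.002), (0, 0, 0.003)]
frame = events_to_count_frame(events, height=2, width=2)
```

Because this representation discards polarity, it matches the non-polarity setting the abstract mentions; a polarity-aware variant would keep separate positive and negative channels.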