IEEE Trans Pattern Anal Mach Intell. 2020 Jun;42(6):1408-1423. doi: 10.1109/TPAMI.2019.2894353. Epub 2019 Jan 22.
We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing. PWC-Net is 17 times smaller in size, 2 times faster in inference, and 11 percent more accurate on Sintel final than the recent FlowNet2 model. It is the winning entry in the optical flow competition of the robust vision challenge. Next, we experimentally analyze the sources of our performance gains. In particular, we use the same training procedure for PWC-Net to retrain FlowNetC, a sub-network of FlowNet2. The retrained FlowNetC is 56 percent more accurate on Sintel final than the previously trained one, and even 5 percent more accurate than the full FlowNet2 model. We further improve the training procedure, increasing the accuracy of PWC-Net on Sintel by 10 percent and on KITTI 2012 and 2015 by 20 percent. Our newly trained model parameters and training protocols are available at https://github.com/NVlabs/PWC-Net.
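The two core operations named above, warping one feature map toward the other and correlating the result into a cost volume, can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy version, not the paper's implementation (PWC-Net applies these per pyramid level inside a CNN); the function names and the displacement range `max_disp` are illustrative choices.

```python
import numpy as np

def warp(feat, flow):
    """Backward-warp a feature map (C, H, W) by a flow field (2, H, W)
    using bilinear sampling; samples outside the image are clamped."""
    C, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    x = xs + flow[0]                     # horizontal target coordinates
    y = ys + flow[1]                     # vertical target coordinates
    x0 = np.clip(np.floor(x).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, H - 2)
    wx = np.clip(x - x0, 0.0, 1.0)       # bilinear weights
    wy = np.clip(y - y0, 0.0, 1.0)
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x0 + 1] * wx
    bot = feat[:, y0 + 1, x0] * (1 - wx) + feat[:, y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bot * wy

def cost_volume(feat1, feat2_warped, max_disp=4):
    """Correlation cost volume: channel-averaged dot product between
    feat1 and shifted copies of the warped feat2, over a
    (2*max_disp+1)^2 search window around each pixel."""
    C, H, W = feat1.shape
    d = max_disp
    pad = np.pad(feat2_warped, ((0, 0), (d, d), (d, d)))
    vol = np.empty(((2 * d + 1) ** 2, H, W), dtype=feat1.dtype)
    k = 0
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            shifted = pad[:, d + dy:d + dy + H, d + dx:d + dx + W]
            vol[k] = (feat1 * shifted).mean(axis=0)
            k += 1
    return vol
```

In the actual network these steps are differentiable layers repeated coarse-to-fine: the flow estimated at a coarser pyramid level is upsampled, used to warp the second image's features, and the resulting cost volume feeds the next flow estimator.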