Ke Jianwei, Watras Alex J, Kim Jae-Jun, Liu Hewei, Jiang Hongrui, Hu Yu Hen
The Department of Electrical and Computer Engineering at the University of Wisconsin-Madison, 1415 Engineering Drive, Madison, WI 53706, USA.
J Signal Process Syst. 2022 Mar;94(3):329-343. doi: 10.1007/s11265-021-01729-0. Epub 2022 Jan 27.
A real-time 3D visualization (RT3DV) system using a multiview RGB camera array is presented. RT3DV processes multiple synchronized video streams to produce a stereo video of a dynamic scene from a chosen view angle. Its design objective is to deliver 3D visualization at the video frame rate with good viewing quality. To support 3D vision, RT3DV estimates and updates a surface mesh model formed directly from a set of sparse key points. The 3D coordinates of these key points are estimated by matching 2D key points across the multiview video streams with the aid of epipolar geometry and the trifocal tensor. To capture scene dynamics, the 2D key points in each video stream are tracked between successive frames. We implemented a proof-of-concept RT3DV system that processes five synchronized video streams acquired by an RGB camera array. It achieves a processing speed of 44 milliseconds per frame and a peak signal-to-noise ratio (PSNR) of 15.9 dB from a viewpoint coinciding with a reference view. For comparison, an image-based multiview stereo (MVS) algorithm using a dense point cloud model and frame-by-frame feature detection and matching requires 7 seconds to render a frame and yields a reference-view PSNR of 16.3 dB.
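The core geometric step the abstract describes, recovering 3D key-point coordinates from matched 2D key points across views, can be illustrated with a minimal two-view sketch. This uses plain direct linear transformation (DLT) triangulation with NumPy; it is an assumption-laden simplification of the paper's pipeline, which additionally uses the trifocal tensor across three or more views. The function name and camera setup below are illustrative, not from the paper.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: matched 2D key-point coordinates (u, v) in each view.
    Returns the 3D point in non-homogeneous coordinates.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The least-squares solution is the right singular vector
    # associated with the smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Illustrative setup: identity intrinsics, second camera shifted along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

def project(P, X):
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

X_true = np.array([0.5, 0.2, 3.0])
X_est = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noise-free matches, the recovered point equals the true point up to numerical precision; in practice, RT3DV-style systems must handle tracking drift and mismatches, which is why the paper combines triangulation with epipolar-geometry constraints.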