School of Instrument Science and Engineering, Southeast University, Nanjing 210096, China.
Microsoft Research Asia, Beijing 100089, China.
Sensors (Basel). 2021 Apr 2;21(7):2464. doi: 10.3390/s21072464.
Multiple-camera systems can expand coverage and mitigate occlusion. However, temporal synchronization remains a problem for budget cameras and capture devices. We propose an out-of-the-box framework that temporally synchronizes multiple cameras using semantic human pose estimates from the videos. Human pose predictions are obtained for each camera with an off-the-shelf pose estimator. Our method first calibrates each pair of cameras by minimizing an energy function related to epipolar distances. We also propose a simple yet effective algorithm for associating multiple people across cameras and a score-regularized energy function for improved performance. Second, we integrate the synchronized camera pairs into a graph and derive the optimal temporal displacement configuration for the multiple-camera system. We evaluate our method on four public benchmark datasets and demonstrate robust sub-frame synchronization accuracy on all of them.
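The two stages described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a known fundamental matrix F for the camera pair, a single person whose joints are already associated across views, a grid search over candidate offsets with linear interpolation between frames, and a plain least-squares fusion of the pairwise offsets. The multiple-person association and the score-regularized energy are omitted, and all function names are hypothetical.

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Symmetric point-to-epipolar-line distance for matched 2D points.

    F is a 3x3 fundamental matrix with x2^T F x1 = 0; x1 and x2 are (N, 2).
    """
    h1 = np.hstack([x1, np.ones((len(x1), 1))])
    h2 = np.hstack([x2, np.ones((len(x2), 1))])
    l2 = h1 @ F.T  # rows are epipolar lines F x1 in image 2
    l1 = h2 @ F    # rows are epipolar lines F^T x2 in image 1
    num = np.abs(np.sum(h2 * l2, axis=1))
    d2 = num / np.hypot(l2[:, 0], l2[:, 1])
    d1 = num / np.hypot(l1[:, 0], l1[:, 1])
    return 0.5 * (d1 + d2)

def sample_pose(poses, t):
    """Linearly interpolate a (T, J, 2) pose track at fractional frame t."""
    i = int(np.floor(t))
    a = t - i
    return (1.0 - a) * poses[i] + a * poses[i + 1]

def estimate_offset(F, poses1, poses2, candidates, min_overlap=10):
    """Grid search for d such that frame t of camera 1 matches frame t + d
    of camera 2. The energy is the mean epipolar distance over the overlap
    (an assumed, simplified energy)."""
    best, best_energy = None, np.inf
    for d in candidates:
        errs = []
        for t in range(len(poses1)):
            s = t + d
            if 0 <= s < len(poses2) - 1:
                pose2 = sample_pose(poses2, s)
                errs.append(epipolar_distance(F, poses1[t], pose2).mean())
        if len(errs) >= min_overlap:
            energy = float(np.mean(errs))
            if energy < best_energy:
                best, best_energy = float(d), energy
    return best, best_energy

def fuse_offsets(n_cams, pair_offsets):
    """Least-squares per-camera shifts o (with o[0] = 0) from pairwise
    offsets d_ij ~ o[j] - o[i]; a stand-in for the graph-based step."""
    rows, rhs = [], []
    for (i, j), d in pair_offsets.items():
        r = np.zeros(n_cams)
        r[i], r[j] = -1.0, 1.0
        rows.append(r)
        rhs.append(d)
    A = np.array(rows)[:, 1:]  # drop camera 0 to fix the gauge
    sol, *_ = np.linalg.lstsq(A, np.array(rhs), rcond=None)
    return np.concatenate([[0.0], sol])
```

A typical call would be `estimate_offset(F, poses1, poses2, np.arange(-5, 5.25, 0.25))` for each camera pair, followed by `fuse_offsets` on the resulting pairwise offsets. Motion along the epipolar lines does not change the energy, so the search is only informative when the joints move with a component across those lines.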