IEEE J Biomed Health Inform. 2024 Mar;28(3):1460-1471. doi: 10.1109/JBHI.2023.3345486. Epub 2024 Mar 6.
Heart and respiratory rate measurements from facial videos are more useful and user-friendly than measurements with traditional contact-based sensors. However, most current deep learning approaches require ground-truth pulse and respiratory waves for model training, which are expensive to collect. In this paper, we propose CalibrationPhys, a self-supervised video-based heart and respiratory rate measurement method that calibrates between multiple cameras. CalibrationPhys trains deep learning models without supervised labels by using facial videos captured simultaneously by multiple cameras. Contrastive learning is performed so that pulse and respiratory waves predicted from synchronized videos captured by different cameras form positive pairs, while those predicted from different videos form negative pairs. CalibrationPhys also improves the robustness of the models by means of a data augmentation technique and successfully leverages a pre-trained model for a particular camera. Experimental results on two datasets demonstrate that CalibrationPhys outperforms state-of-the-art heart and respiratory rate measurement methods. Because we optimize camera-specific models using only videos from multiple cameras, our approach makes it easy to use arbitrary cameras for heart and respiratory rate measurements.
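The cross-camera contrastive objective described above can be illustrated with a minimal sketch. The abstract does not give the loss, the similarity measure, or any hyperparameters, so everything below is an assumption: an InfoNCE-style loss, Pearson correlation as the similarity between predicted waves, a temperature of 0.1, and the hypothetical function names `pearson_matrix` and `cross_camera_contrastive_loss`. It is not the authors' implementation.

```python
import numpy as np


def pearson_matrix(a, b):
    """Pairwise Pearson correlation between the rows of a and the rows of b.

    a, b: arrays of shape (n_clips, n_samples), one predicted wave per row.
    """
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    return a @ b.T


def cross_camera_contrastive_loss(waves_cam_a, waves_cam_b, temperature=0.1):
    """InfoNCE-style loss (an assumed choice, not taken from the paper).

    Row i of each input is the wave predicted for clip i by that camera's
    model. Clip i seen by camera A and clip i seen by camera B are the
    positive pair; every other combination in the batch is a negative.
    """
    sim = pearson_matrix(waves_cam_a, waves_cam_b) / temperature
    # Numerically stable log-softmax over each row.
    sim = sim - sim.max(axis=1, keepdims=True)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal of the similarity matrix.
    return float(-np.mean(np.diag(log_prob)))
```

Under these assumptions, the loss is small when the two cameras' predictions for the same clip agree and large when they are mismatched, which is the signal that lets the models train without ground-truth pulse or respiratory waves.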