School of Information Science and Technology, Northwest University, Xi'an, People's Republic of China.
Center for Machine Vision and Signal Analysis, University of Oulu, Oulu.
Physiol Meas. 2022 Nov 3;43(11). doi: 10.1088/1361-6579/ac98f1.
Objective. Efficient non-contact heart rate (HR) measurement from facial video has received much attention in health monitoring. Past methods relied on prior knowledge and an unproven hypothesis to extract remote photoplethysmography (rPPG) signals, e.g. manually designed regions of interest (ROIs) and the skin reflection model. Approach. This paper presents a short-time end-to-end HR estimation framework based on facial features and the temporal relationships between video frames. In the proposed method, a deep 3D multi-scale network with a cross-layer residual structure is designed to construct an autoencoder and extract robust rPPG features. Then, a spatial-temporal fusion mechanism is proposed to help the network focus on features related to rPPG signals. Both shallow and fused 3D spatial-temporal features are distilled to suppress redundant information in complex environments. Finally, a data augmentation strategy is presented to address the uneven distribution of HR in existing datasets. Main results. The experimental results on four face-rPPG datasets show that our method outperforms the state-of-the-art methods and requires fewer video frames. Compared with the previous best results, the proposed method reduces the root mean square error (RMSE) by 5.9%, 3.4% and 21.4% on the OBF dataset (intra-test), COHFACE dataset (intra-test) and UBFC dataset (cross-test), respectively. Significance. Our method achieves good results on diverse datasets (i.e. highly compressed video, low resolution and illumination variation), demonstrating that it can extract stable rPPG signals in a short time.
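The RMSE comparison reported above can be illustrated with a minimal sketch. It assumes that the quoted percentages are relative reductions with respect to the previous best RMSE, which the abstract does not state explicitly. The function names and all HR values below are hypothetical and are not taken from the paper.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between predicted and reference HR values (bpm)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse_reduction(baseline_rmse, new_rmse):
    """Percentage by which new_rmse is lower than baseline_rmse (assumed definition)."""
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical reference and predicted heart rates in beats per minute.
reference_hr = [60.0, 72.0, 85.0, 90.0]
predicted_hr = [62.0, 70.0, 85.0, 94.0]

error = rmse(predicted_hr, reference_hr)
print(f"RMSE: {error:.3f} bpm")

# Hypothetical baseline RMSE of 2.0 bpm against a new RMSE of 1.5 bpm.
print(f"Relative reduction: {relative_rmse_reduction(2.0, 1.5):.1f}%")
```

Under this reading, a 21.4% improvement means the new RMSE is 78.6% of the previous best value on that dataset.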