

Facial Video-Based Remote Physiological Measurement via Self-Supervised Learning.

Author Information

Yue Zijie, Shi Miaojing, Ding Shuai

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):13844-13859. doi: 10.1109/TPAMI.2023.3298650. Epub 2023 Oct 3.

Abstract

Facial video-based remote physiological measurement aims to estimate remote photoplethysmography (rPPG) signals from human facial videos and then measure multiple vital signs (e.g., heart rate, respiration frequency) from rPPG signals. Recent approaches achieve it by training deep neural networks, which normally require abundant facial videos and synchronously recorded photoplethysmography (PPG) signals for supervision. However, the collection of these annotated corpora is not easy in practice. In this paper, we introduce a novel frequency-inspired self-supervised framework that learns to estimate rPPG signals from facial videos without the need of ground truth PPG signals. Given a video sample, we first augment it into multiple positive/negative samples which contain similar/dissimilar signal frequencies to the original one. Specifically, positive samples are generated using spatial augmentation; negative samples are generated via a learnable frequency augmentation module, which performs non-linear signal frequency transformation on the input without excessively changing its visual appearance. Next, we introduce a local rPPG expert aggregation module to estimate rPPG signals from augmented samples. It encodes complementary pulsation information from different face regions and aggregates them into one rPPG prediction. Finally, we propose a series of frequency-inspired losses, i.e., frequency contrastive loss, frequency ratio consistency loss, and cross-video frequency agreement loss, for the optimization of estimated rPPG signals from multiple augmented video samples. We conduct rPPG-based heart rate, heart rate variability, and respiration frequency estimation on five standard benchmarks. The experimental results demonstrate that our method improves the state of the art by a large margin.
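The last step the abstract describes, measuring vital signs such as heart rate from an estimated rPPG signal, is commonly done by locating the dominant spectral peak of the signal within the physiologically plausible heart-rate band. The sketch below illustrates that generic idea only; it is not the paper's pipeline, and the function name, band limits, and sampling rate are illustrative assumptions.

```python
import numpy as np

def estimate_heart_rate(rppg, fs, lo=0.7, hi=4.0):
    """Estimate heart rate (BPM) from an rPPG waveform as the dominant
    spectral peak inside [lo, hi] Hz (roughly 42-240 BPM).
    Generic illustration, not the paper's exact method."""
    rppg = np.asarray(rppg, dtype=float)
    rppg = rppg - rppg.mean()                  # remove the DC component
    freqs = np.fft.rfftfreq(len(rppg), d=1.0 / fs)
    power = np.abs(np.fft.rfft(rppg)) ** 2     # unnormalized power spectrum
    band = (freqs >= lo) & (freqs <= hi)       # restrict to heart-rate band
    peak_freq = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_freq                    # Hz -> beats per minute

# Synthetic check: 10 s of a 1.2 Hz (72 BPM) pulse sampled at 30 fps with noise
np.random.seed(0)
fs = 30.0
t = np.arange(0, 10, 1.0 / fs)
signal = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)
print(estimate_heart_rate(signal, fs))  # close to 72 BPM
```

Respiration frequency can be read off the same way by restricting the search band to typical breathing rates (roughly 0.1 to 0.5 Hz), which is why a single clean rPPG signal supports multiple vital-sign estimates.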

