Department of Systems and Computer Engineering, Carleton University, Ottawa, ON K1S 5B6, Canada.
Flight Research Laboratory, National Research Council of Canada (NRC), Ottawa, ON K1A 0R6, Canada.
Sensors (Basel). 2024 Oct 2;24(19):6386. doi: 10.3390/s24196386.
Thermal videos provide a privacy-preserving yet information-rich data source for remote health monitoring, especially for respiration rate (RR) estimation. This paper introduces an end-to-end deep learning approach to RR measurement using thermal video data. A detection transformer (DeTr) first finds the subject's facial region of interest in each thermal frame. A respiratory signal is then estimated from the dynamically cropped thermal video using 3D convolutional neural network and bi-directional long short-term memory stages. To account for the expected phase shift between respiration measured with a respiratory effort belt and respiration observed in a facial video, a novel loss function based on the negative maximum cross-correlation and the absolute frequency peak difference was introduced. Thermal recordings of 22 subjects, acquired with simultaneous gold-standard respiratory effort measurements while sitting or standing, both with and without a face mask, were studied. The RR estimation results showed that the proposed method outperformed existing models, achieving an error of only 1.6 breaths per minute across the four conditions. The proposed method sets a new state of the art for RR estimation accuracy while still permitting real-time RR estimation.
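The phase-tolerant loss described in the abstract can be illustrated with a minimal NumPy sketch. This is a plausible reading of "negative maximum cross-correlation plus absolute frequency peak difference", not the paper's actual implementation: the function name `rr_loss`, the sampling rate `fs`, and the weighting factor `lam` are all assumptions introduced here for illustration.

```python
import numpy as np

def rr_loss(pred, ref, fs=30.0, lam=1.0):
    """Hypothetical sketch of the paper's loss: negative maximum
    normalized cross-correlation (tolerant to a belt-vs-face phase
    shift) plus the absolute difference of dominant respiratory
    frequencies. `fs` and `lam` are illustrative assumptions."""
    # Zero-mean, unit-variance normalization of both signals.
    pred = (pred - pred.mean()) / (pred.std() + 1e-8)
    ref = (ref - ref.mean()) / (ref.std() + 1e-8)
    # Maximum normalized cross-correlation over all lags, so a pure
    # phase shift between the two signals is not penalized.
    ncc = (np.correlate(pred, ref, mode="full") / len(pred)).max()
    # Dominant frequency of each signal via the FFT magnitude peak
    # (DC bin excluded).
    freqs = np.fft.rfftfreq(len(pred), d=1.0 / fs)
    f_pred = freqs[np.abs(np.fft.rfft(pred))[1:].argmax() + 1]
    f_ref = freqs[np.abs(np.fft.rfft(ref))[1:].argmax() + 1]
    return -ncc + lam * abs(f_pred - f_ref)
```

For two identical respiratory waveforms the cross-correlation term is 1 and the frequency term is 0, so the loss approaches its minimum of -1; a small phase shift between prediction and reference leaves the loss nearly unchanged, which is the behavior the authors motivate.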