Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK.
Sensors (Basel). 2020 May 1;20(9):2576. doi: 10.3390/s20092576.
The use of visual sensors for monitoring people in their living environments is critical to obtaining more accurate health measurements, but it is undermined by the issue of privacy. Silhouettes, generated from RGB video, can help alleviate the privacy issue to a considerable degree. However, silhouettes make it difficult to discriminate between different subjects, preventing a subject-tailored analysis of the data within a free-living, multi-occupancy home. This limitation can be overcome with a strategic fusion of sensors: wearable accelerometer devices, used in conjunction with the silhouette video data, can match video clips to the specific patient being monitored. The proposed method simultaneously solves the problem of person re-identification (ReID) from silhouettes and enables home monitoring systems to employ sensor-fusion techniques for data analysis. We develop a multimodal deep-learning detection framework that maps short video clips and accelerations into a latent space in which the Euclidean distance between embeddings can be measured to match video and acceleration streams. We train our method on the SPHERE Calorie Dataset, on which we report an average area under the ROC curve of 76.3% and an assignment accuracy of 77.4%. In addition, we propose a novel triplet loss that improves both performance and convergence speed.
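The core matching idea in the abstract, embedding both modalities into a shared latent space, comparing embeddings by Euclidean distance, and training with a triplet loss, can be sketched in plain Python. This is a minimal illustration only: the embedding networks are omitted, and all function names and values below are hypothetical, not the authors' implementation.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the matching (positive) embedding
    closer to the anchor than the mismatched (negative) one, by at
    least `margin`. The paper proposes a novel variant; this is the
    conventional form for illustration."""
    return max(0.0, euclidean(anchor, positive)
                    - euclidean(anchor, negative) + margin)

def match_streams(video_embs, accel_embs):
    """Assign each video-clip embedding to the nearest acceleration
    embedding, i.e. the claimed subject for that clip."""
    return [min(range(len(accel_embs)),
                key=lambda j: euclidean(v, accel_embs[j]))
            for v in video_embs]

# Illustrative 2-D embeddings for two subjects.
video = [[0.0, 0.0], [1.0, 1.0]]
accel = [[0.1, 0.0], [0.9, 1.1]]
print(match_streams(video, accel))  # each clip matched to its nearest stream
```

At training time the loss would be minimized over many (video, matching acceleration, mismatched acceleration) triplets; at test time `match_streams` performs the assignment that the reported 77.4% accuracy measures.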