Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:2442-2446. doi: 10.1109/EMBC48229.2022.9871046.
Missing data is a common challenge in health monitoring systems, in part because these systems depend heavily on different types of sensors. A critical characteristic of sensor-based prediction systems is their dependency on hardware, which is prone to physical limitations that add another layer of complexity to the algorithmic component of the system. For instance, it may not be realistic to assume that the prediction model has access to all sensors at all times. In real-world deployments, one or more sensors on a device can malfunction or have to be temporarily disabled due to power limitations. The model then faces "missing input data" from the unavailable sensors at deployment time, and the quality of its predictions can degrade significantly. While missing input data is a well-known problem, to the best of our knowledge no study has addressed how to efficiently minimize the performance drop when one or more sensors may be unavailable for a significant amount of time. The sensor failure problem investigated in this paper can be viewed as a spatial missing data problem, which has not been explored to date. In this work, we show that naive known methods of dealing with missing input data, such as zero-filling or mean-filling, are not suitable for sensor-based prediction, and we propose an algorithm that can reconstruct the missing input data for unavailable sensors. Moreover, we show that on the MobiAct, MotionSense, and MHEALTH activity classification benchmarks, our proposed method outperforms the baselines by large accuracy margins of 8.2%, 15.1%, and 11.6%, respectively.
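The two naive baselines named above, zero-filling and mean-filling, can be illustrated with a minimal sketch. It is not the reconstruction algorithm proposed in the paper, and the abstract does not specify how the baselines were implemented. The array layout `(n_windows, n_timesteps, n_channels)`, the function names `zero_fill` and `mean_fill`, and the use of per-channel training means are all assumptions made for illustration.

```python
import numpy as np


def zero_fill(windows, missing_channels):
    """Replace the channels of unavailable sensors with zeros.

    windows: float array, assumed shape (n_windows, n_timesteps, n_channels).
    missing_channels: list of channel indices belonging to unavailable sensors.
    """
    filled = windows.copy()  # leave the caller's array untouched
    filled[..., missing_channels] = 0.0
    return filled


def mean_fill(windows, missing_channels, train_channel_means):
    """Replace the channels of unavailable sensors with per-channel training means.

    train_channel_means: 1-D array of length n_channels, computed on training data.
    """
    filled = windows.copy()
    # The selected means have shape (k,) and broadcast over windows and timesteps.
    filled[..., missing_channels] = train_channel_means[missing_channels]
    return filled


# Synthetic stand-in for sensor windows: 6 channels, e.g. a 3-axis accelerometer
# (channels 0-2) and a 3-axis gyroscope (channels 3-5). Purely hypothetical data.
rng = np.random.default_rng(0)
train_windows = rng.normal(loc=1.0, size=(8, 50, 6))
test_windows = rng.normal(loc=1.0, size=(2, 50, 6))

# Per-channel statistics come from training data only.
channel_means = train_windows.mean(axis=(0, 1))

# Simulate the second sensor being unavailable at deployment time.
missing = [3, 4, 5]
zero_filled = zero_fill(test_windows, missing)
mean_filled = mean_fill(test_windows, missing, channel_means)
```

Both baselines substitute a constant for every timestep of the missing channels, so the filled input carries no information from the sensors that remain available.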