Jeon Eun Som, Choi Hongjun, Shukla Ankita, Wang Yuan, Buman Matthew P, Turaga Pavan
Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA.
Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC 29208 USA.
Conf Rec Asilomar Conf Signals Syst Comput. 2022 Oct-Nov;2022:837-842. doi: 10.1109/ieeeconf56349.2022.10052019. Epub 2023 Mar 7.
Converting wearable sensor data into actionable health insights has attracted considerable interest in recent years. Deep learning methods have been applied successfully across many wearable applications. However, wearable sensor data poses distinct challenges: sensitivity and variability between subjects, and a dependence of the analysis on sampling rate. Topological data analysis (TDA) has shown promise in mitigating these issues. Through the persistent homology algorithm, TDA captures robust features of complex data, such as persistence images (PI), which hold the promise of boosting machine learning performance. However, because of the computational load TDA methods require on large-scale data, their integration and implementation have lagged behind. Further, many wearable applications require models compact enough to be deployed on edge devices. In this context, knowledge distillation (KD) has been widely applied to generate a small model (student model) from a pre-trained high-capacity network (teacher model). In this paper, we propose a new KD strategy using two teacher models: one that uses the raw time series and another that uses persistence images computed from the time series. These two teachers then train a student using KD. In essence, the student learns from heterogeneous teachers that provide different knowledge. To account for the different properties of the teachers' features, we apply an annealing strategy and an adaptive temperature in KD. Finally, a robust student model is distilled, which uses the time-series data only. We find that incorporating persistence features via a second teacher leads to significantly improved performance. This approach provides a unique way of fusing deep learning with topological features to develop effective models.
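The two-teacher setup described in the abstract can be illustrated with a minimal NumPy sketch of a distillation loss that combines soft targets from a time-series teacher and a persistence-image teacher. This is not the authors' implementation. The function names, the weights `alpha` and `lam`, and the fixed per-teacher temperatures `temp_ts` and `temp_pi` are assumptions made for illustration. The abstract does not specify how the annealing strategy or the adaptive temperature are computed, so neither is modeled here and the temperatures are passed in as plain parameters.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """Per-sample KL(p || q) over the last axis."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

def two_teacher_kd_loss(student_logits, ts_teacher_logits, pi_teacher_logits,
                        labels, temp_ts=4.0, temp_pi=4.0, alpha=0.5, lam=0.7):
    """Hypothetical combined loss (all hyperparameters are assumptions):
    (1 - lam) * cross-entropy + lam * (alpha * KD_ts + (1 - alpha) * KD_pi).
    """
    student_logits = np.asarray(student_logits, dtype=float)
    labels = np.asarray(labels)
    n = labels.shape[0]

    # Hard-label cross-entropy on the student's untempered predictions.
    probs = softmax(student_logits)
    ce = -np.log(np.clip(probs[np.arange(n), labels], 1e-12, 1.0)).mean()

    # Soft-target term from the teacher trained on the raw time series.
    kd_ts = temp_ts ** 2 * kl_divergence(
        softmax(ts_teacher_logits, temp_ts),
        softmax(student_logits, temp_ts)).mean()

    # Soft-target term from the teacher trained on persistence images.
    kd_pi = temp_pi ** 2 * kl_divergence(
        softmax(pi_teacher_logits, temp_pi),
        softmax(student_logits, temp_pi)).mean()

    return (1.0 - lam) * ce + lam * (alpha * kd_ts + (1.0 - alpha) * kd_pi)
```

In this sketch only the student's logits depend on the raw time series at inference time; the persistence-image teacher contributes solely through its soft targets during training, which matches the abstract's statement that the distilled student uses time-series data only.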