Jeon Eun Som, Choi Hongjun, Shukla Ankita, Wang Yuan, Lee Hyunglae, Buman Matthew P, Turaga Pavan
Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, 85281, AZ, USA.
Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, 29208, SC, USA.
Eng Appl Artif Intell. 2024 Apr;130. doi: 10.1016/j.engappai.2023.107719. Epub 2023 Dec 20.
Deep learning methods have achieved considerable success in various applications that convert wearable sensor data into actionable health insights. A common application area is activity recognition, where deep learning methods still suffer from limitations such as sensitivity to signal quality, sensor characteristic variations, and variability between subjects. To mitigate these issues, robust features obtained by topological data analysis (TDA) have been suggested as a potential solution. However, there are two significant obstacles to using topological features in deep learning: (1) the large computational load of extracting topological features using TDA, and (2) the different signal representations obtained from deep learning and TDA, which make fusion difficult. In this paper, to integrate the strengths of topological methods into deep learning for time-series data, we propose to use two teacher networks - one trained on the raw time-series data, and another trained on persistence images generated by TDA methods. These two teachers are jointly used to distill a single student model, which utilizes only the raw time-series data at test time. This approach addresses both issues. The use of knowledge distillation (KD) with multiple teachers exploits complementary information and results in a compact model with strong supervisory features and a richer, integrated representation. To assimilate desirable information from the different modalities, we design new constraints, including orthogonality imposed on feature correlation maps, which improve feature expressiveness and allow the student to learn easily from the teachers. We also apply an annealing strategy in KD for fast saturation and better accommodation of the different features, while reducing the knowledge gap between the teachers and the student. Finally, a robust student model is distilled, which at test time uses only the time-series data as input while implicitly preserving topological features.
The experimental results demonstrate the effectiveness of the proposed method on wearable sensor data. The proposed method achieves 71.74% classification accuracy on GENEActiv with a WRN16-1 (1D CNN) student, which outperforms the baselines and requires much less processing time than the teachers (less than 17 sec on 6k testing samples).
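The two-teacher distillation objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard temperature-softened KD loss, and the function name `two_teacher_kd_loss` and the parameters `temperature`, `kd_weight`, and `ts_teacher_share` are hypothetical choices made for illustration. The orthogonality constraint on feature correlation maps and the annealing strategy are not specified in enough detail in the abstract and are left out.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label] + eps)


def two_teacher_kd_loss(student_logits, ts_teacher_logits, pi_teacher_logits,
                        label, temperature=4.0, kd_weight=0.9,
                        ts_teacher_share=0.5):
    """Cross-entropy on the label plus a weighted KD term from two teachers.

    ts_teacher_logits: teacher trained on raw time-series data.
    pi_teacher_logits: teacher trained on persistence images.
    All weights and the temperature are assumed values, not from the paper.
    """
    student_soft = softmax(student_logits, temperature)
    kd_ts = kl_divergence(softmax(ts_teacher_logits, temperature), student_soft)
    kd_pi = kl_divergence(softmax(pi_teacher_logits, temperature), student_soft)
    # Convex combination of the two teachers' soft-label signals.
    kd = ts_teacher_share * kd_ts + (1.0 - ts_teacher_share) * kd_pi
    ce = cross_entropy(softmax(student_logits), label)
    # The temperature**2 factor is the usual rescaling of the softened term.
    return (1.0 - kd_weight) * ce + kd_weight * temperature ** 2 * kd
```

In this sketch the student receives only its own logits from the raw time-series input; the persistence-image teacher contributes solely through its softened outputs during training, which is what allows the TDA step to be skipped at test time.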