Wang Zhen, Song Yingzhe, Pang Lei, Li Shanjun, Sun Gang
Institute of Artificial Intelligence in Sports, Capital University of Physical Education and Sports, Beijing 100191, China.
Sensors (Basel). 2025 Jun 29;25(13):4062. doi: 10.3390/s25134062.
Dynamic oxygen uptake (VO₂) reflects moment-to-moment changes in oxygen consumption during exercise and underpins training design, performance enhancement, and clinical decision-making. We tackled two key obstacles, the limited fusion of heterogeneous sensor data and the inadequate modeling of long-range temporal patterns, by integrating wearable accelerometer and heart-rate streams with a convolutional neural network-LSTM (CNN-LSTM) architecture and optional attention modules. Physiological signals and VO₂ were recorded from 21 adults during a resting assessment and cardiopulmonary exercise testing. Pairing accelerometer with heart-rate inputs improved prediction compared with using heart rate alone. The baseline CNN-LSTM reached R² = 0.946, outperforming a plain LSTM (R² = 0.926) thanks to stronger local spatio-temporal feature extraction. Introducing a spatial attention mechanism raised accuracy further (R² = 0.962), whereas temporal attention reduced it (R² = 0.930), indicating that the benefit of attention depends on how well the attended features align with exercise dynamics. Stacking both attention mechanisms (spatio-temporal) yielded R² = 0.960, slightly below the value for spatial attention alone, implying that added complexity does not guarantee better performance. Across all models, prediction errors grew during high-intensity bouts, highlighting a bottleneck in capturing non-linear physiological responses under heavy load. These findings inform architecture selection for wearable metabolic monitoring and clarify when attention mechanisms add value.
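The model comparison above rests on a single goodness-of-fit score per architecture. As a rough illustration, and assuming those scores are coefficient-of-determination (R²) values computed between measured and predicted oxygen uptake, the sketch below shows how such a score could be obtained. The function name `r_squared` and the sample values are hypothetical and are not taken from the study.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_measured = sum(measured) / len(measured)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Made-up illustrative values (L/min), NOT data from the study.
measured_vo2 = [0.4, 0.9, 1.5, 2.1, 2.8, 3.3]
predicted_vo2 = [0.5, 1.0, 1.4, 2.0, 2.6, 3.0]
score = r_squared(measured_vo2, predicted_vo2)
print(f"R^2 = {score:.3f}")
```

In this toy series the largest residuals sit at the highest values, which mirrors the reported pattern of errors growing during high-intensity bouts. A score of 1.0 means perfect agreement, and a model that always predicts the mean of the measured values scores 0.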