Yang Hongyu, Dong Rou, Guo Rong, Che Yonglin, Xie Xiaolong, Yang Jianke, Zhang Jiajin
College of Mechanical and Electrical Engineering, Yunnan Agricultural University, Kunming 650201, China.
Center for Sports Intelligence Innovation and Application, Yunnan Agricultural University, Kunming 650201, China.
Sensors (Basel). 2025 Mar 12;25(6):1746. doi: 10.3390/s25061746.
The demand for intelligent monitoring systems tailored to elderly living environments is increasing rapidly worldwide as populations age. Traditional acoustic scene monitoring systems that rely on cloud computing are limited by data transmission delays and privacy concerns. Hence, this study proposes an acoustic scene recognition system that integrates edge computing with deep learning to enable real-time monitoring of elderly individuals' daily activities. The system consists of low-power edge devices equipped with multiple microphones, portable wearable components, and compact power modules, allowing it to integrate seamlessly into the daily lives of the elderly. We developed four deep learning models: a convolutional neural network (CNN), a long short-term memory (LSTM) network, a bidirectional long short-term memory (BiLSTM) network, and a deep neural network (DNN). Model quantization techniques were used to reduce their computational complexity and memory usage so that they meet edge device constraints. The CNN model outperformed the other models, achieving 98.5% accuracy, an inference time of 2.4 ms, and low memory requirements (25.63 KB of Flash and 5.15 KB of RAM). This architecture provides an efficient, reliable, and user-friendly solution for real-time acoustic scene monitoring in elderly care.
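The abstract does not say which quantization scheme or toolchain was used, so the following is only a minimal sketch of one common approach, 8-bit affine quantization, to illustrate why quantization shrinks memory use: each weight is stored in 1 byte instead of the 4 bytes of a 32-bit float. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical and are not taken from the paper.

```python
def quantize_int8(weights):
    """Map float weights to int8 using an affine (scale, zero-point) scheme.

    Illustrative assumption only; not the authors' actual method.
    """
    lo, hi = min(weights), max(weights)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0 or 1.0      # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    # Round each weight to the nearest step, shift, and clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 codes."""
    return [(v - zero_point) * scale for v in q]


# Hypothetical sample weights, chosen only for demonstration.
weights = [-0.82, -0.11, 0.0, 0.37, 1.25]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

For these sample values the reconstruction error stays within half a quantization step (`scale / 2`), which is the usual trade-off: a small, bounded loss of precision in exchange for a roughly fourfold reduction in weight storage.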