Ahmed Md Sabbir, Rahman Arafat, Wang Zhiyuan, Rucker Mark, Barnes Laura E
Department of Systems and Information Engineering, University of Virginia, VA, USA.
Proc Annu Int Conf Mob Comput Netw. 2024 Nov;2024:1805-1807. doi: 10.1145/3636534.3698866. Epub 2024 Dec 4.
While audio data shows promise in addressing various health challenges, there is a lack of research on on-device audio processing for smartwatches. Privacy concerns make storing raw audio and performing post-hoc analysis undesirable for many users. Additionally, current on-device audio processing systems for smartwatches are limited in their feature extraction capabilities, restricting their potential for understanding user behavior and health. We developed a real-time system for on-device audio processing on smartwatches, which takes an average of 1.78 minutes (SD = 0.07 min) to extract 22 spectral and rhythmic features from a 1-minute audio sample, using a small window size of 25 milliseconds. Using these extracted audio features on a public dataset, we developed models and incorporated them into the watch to classify foreground and background speech in real time. Our Random Forest-based model classifies speech with a balanced accuracy of 80.3%.
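The abstract describes a pipeline of spectral and rhythmic feature extraction over 25 ms windows followed by Random Forest classification of foreground versus background speech. The sketch below is only an illustrative offline analogue in Python, not the authors' on-device implementation: the exact 22-feature set is not listed in the abstract, and the sampling rate, hop length, feature choices (librosa descriptors), and dataset loader are all assumptions.

```python
# Illustrative sketch only; the paper's on-device smartwatch pipeline and its
# exact 22-feature set are not reproduced here.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

SR = 16000                # assumed sampling rate
WIN = int(0.025 * SR)     # 25 ms analysis window, as stated in the abstract
HOP = WIN // 2            # assumed 50% frame overlap

def extract_features(path):
    """Summarize one clip into a fixed-length vector of spectral/rhythmic stats."""
    y, _ = librosa.load(path, sr=SR, mono=True)
    centroid = librosa.feature.spectral_centroid(y=y, sr=SR, n_fft=WIN, hop_length=HOP)[0]
    rolloff = librosa.feature.spectral_rolloff(y=y, sr=SR, n_fft=WIN, hop_length=HOP)[0]
    flatness = librosa.feature.spectral_flatness(y=y, n_fft=WIN, hop_length=HOP)[0]
    zcr = librosa.feature.zero_crossing_rate(y, frame_length=WIN, hop_length=HOP)[0]
    onset = librosa.onset.onset_strength(y=y, sr=SR, hop_length=HOP)  # rhythmic proxy
    feats = [centroid, rolloff, flatness, zcr, onset]
    n = min(f.shape[-1] for f in feats)          # align frame counts defensively
    frames = np.vstack([f[:n] for f in feats])
    # Collapse frame-level features to clip-level statistics (mean and std).
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])

def train_foreground_classifier(paths, labels):
    """Fit a Random Forest on clip-level features and report balanced accuracy."""
    X = np.array([extract_features(p) for p in paths])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.3, stratify=labels, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_tr, y_tr)
    return clf, balanced_accuracy_score(y_te, clf.predict(X_te))
```

Balanced accuracy is used here, as in the abstract, because it averages per-class recall and so is not inflated when foreground and background speech segments are imbalanced in the dataset.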