
Multimodal fall detection for solitary individuals based on audio-video decision fusion processing.

Authors

Jiao Shiqin, Li Guoqi, Zhang Guiyang, Zhou Jiahao, Li Jihong

Affiliations

School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China.

Jinan Thomas School, Jinan, Shandong 250102, China.

Publication information

Heliyon. 2024 Apr 16;10(8):e29596. doi: 10.1016/j.heliyon.2024.e29596. eCollection 2024 Apr 30.

Abstract

Falls pose significant safety risks to people living alone, especially the elderly. A fast and efficient fall detection system is an effective way to address this hidden danger. We propose a multimodal method based on audio and video. Using only non-intrusive equipment, it reduces to some extent the false negatives that the most commonly used video-based methods may suffer because of insufficient lighting, subjects moving out of the monitored range, and similar conditions. Methods based on audio-video fusion are therefore expected to become the best solution for fall detection in the foreseeable future. Specifically, the video-based model uses YOLOv7-Pose to extract key skeleton joints, which are then fed into a two-stream Spatial Temporal Graph Convolutional Network (ST-GCN) for classification. Meanwhile, the audio-based model uses log-scaled mel spectrograms to capture different features, which are processed by the MobileNetV2 architecture for detection. The two results are fused at the decision level through linear weighting and Dempster-Shafer (D-S) theory. In our evaluation, the multimodal method significantly outperforms the single-modality method; in particular, sensitivity increases from 81.67% with video alone to 96.67% (linear weighting) and 97.50% (D-S theory). This underscores the effectiveness of integrating video and audio data to achieve more robust and reliable fall detection in complex and diverse daily-life environments.
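
The two decision-fusion strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weights, the mass assignments, or the decision threshold, so the function names, the default weight `w_video=0.5`, and the simplification that each model's output probability is used directly as a mass over {fall, no fall} are all assumptions made for illustration.

```python
def linear_fusion(p_video, p_audio, w_video=0.5):
    """Weighted average of the two per-modality fall probabilities.

    w_video is a hypothetical weight; the audio weight is 1 - w_video.
    """
    if not 0.0 <= w_video <= 1.0:
        raise ValueError("w_video must lie in [0, 1]")
    return w_video * p_video + (1.0 - w_video) * p_audio


def dempster_fusion(p_video, p_audio):
    """Dempster's rule of combination for two sources over {fall, no_fall}.

    Assumption: each source puts mass p on 'fall' and 1 - p on 'no_fall',
    with no mass left on the whole frame (no explicit uncertainty term).
    """
    agree_fall = p_video * p_audio
    agree_no_fall = (1.0 - p_video) * (1.0 - p_audio)
    # Mass on which the two sources do not conflict, i.e. 1 - K,
    # where K is the conflict coefficient.
    normalizer = agree_fall + agree_no_fall
    if normalizer == 0.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return agree_fall / normalizer


# Hypothetical scores: video is unsure (e.g. poor lighting), audio is confident.
p_video, p_audio = 0.4, 0.9
fused_linear = linear_fusion(p_video, p_audio)
fused_ds = dempster_fusion(p_video, p_audio)
```

With these made-up inputs, linear weighting yields 0.65 and Dempster's rule yields roughly 0.857, so a confident audio score can compensate for a weak video score. This is the kind of case (a fall missed by video alone) that the reported sensitivity gain refers to.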

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/176d/11053201/13aeb14bbaae/gr001.jpg

Similar articles

1
Multimodal fall detection for solitary individuals based on audio-video decision fusion processing.
Heliyon. 2024 Apr 16;10(8):e29596. doi: 10.1016/j.heliyon.2024.e29596. eCollection 2024 Apr 30.
2
Fall Detection Method for Infrared Videos Based on Spatial-Temporal Graph Convolutional Network.
Sensors (Basel). 2024 Jul 17;24(14):4647. doi: 10.3390/s24144647.
4
DepITCM: an audio-visual method for detecting depression.
Front Psychiatry. 2025 Jan 23;15:1466507. doi: 10.3389/fpsyt.2024.1466507. eCollection 2024.
5
Fall recognition using a three stream spatio temporal GCN model with adaptive feature aggregation.
Sci Rep. 2025 Mar 27;15(1):10635. doi: 10.1038/s41598-025-95508-7.
6
Multimodal depression detection based on an attention graph convolution and transformer.
Math Biosci Eng. 2025 Feb 27;22(3):652-676. doi: 10.3934/mbe.2025024.
7
Integrating audio and visual modalities for multimodal personality trait recognition hybrid deep learning.
Front Neurosci. 2023 Jan 6;16:1107284. doi: 10.3389/fnins.2022.1107284. eCollection 2022.
8
A robust multimodal detection system: physical exercise monitoring in long-term care environments.
Front Bioeng Biotechnol. 2024 Aug 8;12:1398291. doi: 10.3389/fbioe.2024.1398291. eCollection 2024.
9
Human Fall Detection Using 3D Multi-Stream Convolutional Neural Networks with Fusion.
Diagnostics (Basel). 2022 Dec 6;12(12):3060. doi: 10.3390/diagnostics12123060.

Cited by

1
A Review of You Only Look Once Algorithms in Animal Phenotyping Applications.
Animals (Basel). 2025 Apr 13;15(8):1126. doi: 10.3390/ani15081126.
