Early identification of stroke through deep learning with multi-modal human speech and movement data.

Author information

Ou Zijun, Wang Haitao, Zhang Bin, Liang Haobang, Hu Bei, Ren Longlong, Liu Yanjuan, Zhang Yuhu, Dai Chengbo, Wu Hejun, Li Weifeng, Li Xin

Affiliations

School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, Guangdong Province, China.

Department of Neurology, Guangdong Neuroscience Institute, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, Guangdong Province, China.

Publication information

Neural Regen Res. 2025 Jan 1;20(1):234-241. doi: 10.4103/1673-5374.393103. Epub 2024 Jan 8.

DOI:10.4103/1673-5374.393103

PMID:38767488

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11246124/

Abstract

Early identification and treatment of stroke can greatly improve patient outcomes and quality of life. Although clinical tests such as the Cincinnati Pre-hospital Stroke Scale (CPSS) and the Face Arm Speech Test (FAST) are commonly used for stroke screening, accurate administration depends on specialized training. In this study, we proposed a novel multi-modal deep learning approach, based on the FAST, for assessing suspected stroke patients exhibiting symptoms such as limb weakness, facial paresis, and speech disorders in acute settings. We collected a dataset comprising videos and audio recordings of emergency room patients performing designated limb movements, facial expressions, and speech tests based on the FAST. We compared the constructed deep learning model, which was designed to process multi-modal datasets, with six prior models that achieved good action classification performance: I3D, SlowFast, X3D, TPN, TimeSformer, and MViT. We found that the results of our deep learning model had higher clinical value than those of the other approaches. Moreover, the multi-modal model outperformed its single-module variants, highlighting the benefit of utilizing multiple types of patient data, such as action videos and speech audio. These results indicate that a multi-modal deep learning model combined with the FAST could greatly improve the accuracy and sensitivity of early stroke identification, thus providing a practical and powerful tool for assessing stroke patients in an emergency clinical setting.
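The abstract reports gains in accuracy and sensitivity without defining them. As a minimal sketch, assuming the standard binary-classification definitions (accuracy is the share of correct predictions; sensitivity is the share of true stroke cases that are flagged), the two metrics can be computed as follows. The function name `screening_metrics` and the sample labels are hypothetical and purely illustrative; they are not the study's data or code.

```python
from typing import Dict, Sequence


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy and sensitivity for binary labels (1 = stroke, 0 = non-stroke)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # stroke, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-stroke, cleared
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # stroke, missed

    positives = tp + fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity is undefined when there are no true stroke cases.
        "sensitivity": tp / positives if positives else float("nan"),
    }


# Made-up labels for illustration only (not results from the paper).
labels = [1, 1, 1, 0, 0, 1, 0, 1]
predictions = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = screening_metrics(labels, predictions)
print(metrics)
```

In a screening setting, sensitivity is often the metric to watch, because a missed stroke (false negative) typically costs more than a false alarm.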

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0f1c/11246124/0949720a82e2/NRR-20-234-g002.jpg

Similar articles

DeepStroke: An efficient stroke screening framework for emergency rooms with multimodal adversarial deep learning.

Med Image Anal. 2022 Aug;80:102522. doi: 10.1016/j.media.2022.102522. Epub 2022 Jun 25.

Prehospital stroke scales as screening tools for early identification of stroke and transient ischemic attack.

Cochrane Database Syst Rev. 2019 Apr 9;4(4):CD011427. doi: 10.1002/14651858.CD011427.pub2.

A Multimodal Pain Sentiment Analysis System Using Ensembled Deep Learning Approaches for IoT-Enabled Healthcare Framework.

Sensors (Basel). 2025 Feb 17;25(4):1223. doi: 10.3390/s25041223.

Exploring Deep Transfer Learning Techniques for Alzheimer's Dementia Detection.

Front Comput Sci. 2021 May;3. doi: 10.3389/fcomp.2021.624683. Epub 2021 May 12.

Multi-modal deep learning for automated assembly of periapical radiographs.

J Dent. 2023 Aug;135:104588. doi: 10.1016/j.jdent.2023.104588. Epub 2023 Jun 21.

Comparison of eight prehospital stroke scales to detect intracranial large-vessel occlusion in suspected stroke (PRESTO): a prospective observational study.

Lancet Neurol. 2021 Mar;20(3):213-221. doi: 10.1016/S1474-4422(20)30439-7. Epub 2021 Jan 7.

A comprehensive framework for multi-modal hate speech detection in social media using deep learning.

Sci Rep. 2025 Apr 15;15(1):13020. doi: 10.1038/s41598-025-94069-z.

Development of a multi-modal learning-based lymph node metastasis prediction model for lung cancer.

Clin Imaging. 2024 Oct;114:110254. doi: 10.1016/j.clinimag.2024.110254. Epub 2024 Aug 9.

Impact of bilingual face, arm, speech, time (FAST) public awareness campaigns on emergency medical services (EMS) activation in a large Canadian metropolitan area.

CJEM. 2023 May;25(5):403-410. doi: 10.1007/s43678-023-00482-6. Epub 2023 Apr 3.

Cited by

Mapping artificial intelligence models in emergency medicine: A scoping review on artificial intelligence performance in emergency care and education.

Turk J Emerg Med. 2025 Apr 1;25(2):67-91. doi: 10.4103/tjem.tjem_45_25. eCollection 2025 Apr-Jun.

Application of Artificial Intelligence in Acute Ischemic Stroke: A Scoping Review.

Neurointervention. 2025 Mar;20(1):4-14. doi: 10.5469/neuroint.2025.00052. Epub 2025 Feb 18.

Conceptual understanding and cognitive patterns construction for physical education teaching based on deep learning algorithms.

Sci Rep. 2024 Dec 28;14(1):31409. doi: 10.1038/s41598-024-83028-9.
