Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, 72 E. Concord Street, Evans 636, Boston, MA, 02118, USA.
The Framingham Heart Study, Boston University, Boston, MA, 02118, USA.
Alzheimers Res Ther. 2021 Aug 31;13(1):146. doi: 10.1186/s13195-021-00888-3.
Reliable, affordable, and easy-to-use strategies for the detection of dementia are sorely needed. Digital technologies, such as individual voice recordings, offer an attractive modality for assessing cognition, but methods that can automatically analyze such data are not readily available.
We used 1264 voice recordings of neuropsychological examinations administered to participants from the Framingham Heart Study (FHS), a community-based longitudinal observational study. The recordings were 73 min long on average and contained at least two speakers (participant and examiner). Of the total, 483 recordings were of participants with normal cognition (NC), 451 of participants with mild cognitive impairment (MCI), and 330 of participants with dementia (DE). We developed two deep learning models, a two-level long short-term memory (LSTM) network and a convolutional neural network (CNN), that used the audio recordings to classify whether a recording came from a participant with NC or with DE, and to differentiate recordings of participants with DE from those without DE (NDE, i.e., NC + MCI). Based on 5-fold cross-validation, the LSTM model achieved a mean (± std) area under the receiver operating characteristic curve (AUC) of 0.740 ± 0.017, a mean balanced accuracy of 0.647 ± 0.027, and a mean weighted F1 score of 0.596 ± 0.047 in classifying cases with DE versus those with NC. The CNN model achieved a mean AUC of 0.805 ± 0.027, a mean balanced accuracy of 0.743 ± 0.015, and a mean weighted F1 score of 0.742 ± 0.033 on the same task. For classifying participants with DE versus NDE, the LSTM model achieved a mean AUC of 0.734 ± 0.014, a mean balanced accuracy of 0.675 ± 0.013, and a mean weighted F1 score of 0.671 ± 0.015, while the CNN model achieved a mean AUC of 0.746 ± 0.021, a mean balanced accuracy of 0.652 ± 0.020, and a mean weighted F1 score of 0.635 ± 0.031.
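The abstract does not specify the feature extraction or architecture details, so the following is only a minimal sketch, not the authors' implementation: a small CNN classifying log-mel spectrogram segments as DE versus NC, evaluated with 5-fold cross-validation and the same metrics quoted above (AUC, balanced accuracy, weighted F1). The segment shape, layer sizes, and synthetic data are illustrative assumptions.

```python
# Minimal sketch (not the authors' pipeline): a toy CNN over log-mel spectrogram
# segments, with 5-fold cross-validated AUC, balanced accuracy, and weighted F1.
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score, balanced_accuracy_score, f1_score


class SpectrogramCNN(nn.Module):
    """Small CNN over (1, n_mels, n_frames) spectrogram segments (assumed shape)."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))


def evaluate_cv(X: np.ndarray, y: np.ndarray, n_splits: int = 5, epochs: int = 5):
    """5-fold cross-validation returning mean/std of the abstract's three metrics."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    aucs, baccs, f1s = [], [], []
    for train_idx, test_idx in skf.split(X, y):
        model = SpectrogramCNN()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        Xtr = torch.tensor(X[train_idx], dtype=torch.float32)
        ytr = torch.tensor(y[train_idx], dtype=torch.long)
        for _ in range(epochs):          # full-batch training, for brevity
            opt.zero_grad()
            loss_fn(model(Xtr), ytr).backward()
            opt.step()
        with torch.no_grad():
            logits = model(torch.tensor(X[test_idx], dtype=torch.float32))
            prob_de = torch.softmax(logits, dim=1)[:, 1].numpy()
            pred = logits.argmax(dim=1).numpy()
        aucs.append(roc_auc_score(y[test_idx], prob_de))
        baccs.append(balanced_accuracy_score(y[test_idx], pred))
        f1s.append(f1_score(y[test_idx], pred, average="weighted"))
    return {name: (float(np.mean(vals)), float(np.std(vals)))
            for name, vals in [("auc", aucs), ("balanced_acc", baccs), ("weighted_f1", f1s)]}


if __name__ == "__main__":
    # Synthetic stand-in data: 100 segments of shape (1, 64, 128), 0 = NC, 1 = DE.
    # Real inputs would be features extracted from the FHS audio recordings.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 1, 64, 128)).astype(np.float32)
    y = rng.integers(0, 2, size=100)
    print(evaluate_cv(X, y))
```

The same evaluation loop would apply to a sequence model such as the two-level LSTM described above, with the CNN swapped for a recurrent network over frame-level features.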
This proof-of-concept study demonstrates that automated, deep learning-driven processing of audio recordings from neuropsychological testing of individuals recruited in a community cohort setting can facilitate dementia screening.