Detection of Mild Cognitive Impairment From Non-Semantic, Acoustic Voice Features: The Framingham Heart Study.

Affiliations

Department of Anatomy and Neurobiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States.

The Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States.

Publication information

JMIR Aging. 2024 Aug 22;7:e55126. doi: 10.2196/55126.

Abstract

BACKGROUND

With the aging global population and the rising burden of Alzheimer disease and related dementias (ADRDs), there is a growing focus on identifying mild cognitive impairment (MCI) to enable timely interventions that could potentially slow down the onset of clinical dementia. The production of speech by an individual is a cognitively complex task that engages various cognitive domains. The ease of audio data collection highlights the potential cost-effectiveness and noninvasive nature of using human speech as a tool for cognitive assessment.

OBJECTIVE

This study aimed to construct a machine learning pipeline that incorporates speaker diarization, feature extraction, feature selection, and classification to identify a set of acoustic features derived from voice recordings that exhibit strong MCI detection capability.

METHODS

The study included 100 MCI cases and 100 cognitively normal controls matched for age, sex, and education from the Framingham Heart Study. Participants' spoken responses on neuropsychological tests were recorded, and the recorded audio was processed to identify segments of each participant's voice from recordings that included voices of both testers and participants. A comprehensive set of 6385 acoustic features was then extracted from these voice segments using OpenSMILE and Praat software. Subsequently, a random forest model was constructed to classify cognitive status using the features that exhibited significant differences between the MCI and cognitively normal groups. The MCI detection performance of various audio lengths was further examined.
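The filtering step described above, keeping only features that differ significantly between groups, can be sketched in a few lines. This is a minimal illustration and not the authors' code: the abstract does not name the statistical test or threshold, so the permutation test on group means, the `alpha=0.05` cutoff, the function names, and the synthetic feature values are all assumptions made for the example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Assumption: the abstract does not say which test was used; this
    test is a stand-in chosen because it needs only the stdlib.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_permutations + 1)


def select_features(features, labels, alpha=0.05):
    """Return names of features whose group difference has p < alpha.

    features: dict mapping feature name -> one value per participant
    labels:   1 for MCI, 0 for cognitively normal (hypothetical coding)
    """
    kept = []
    for name, values in features.items():
        mci = [v for v, y in zip(values, labels) if y == 1]
        ctl = [v for v, y in zip(values, labels) if y == 0]
        if permutation_p_value(mci, ctl) < alpha:
            kept.append(name)
    return kept


# Synthetic toy data, not study data: one feature with a built-in
# group difference and one pure-noise feature.
rng = random.Random(42)
labels = [1] * 30 + [0] * 30
features = {
    "filled_pause_count": [rng.gauss(8, 2) for _ in range(30)]
    + [rng.gauss(4, 2) for _ in range(30)],
    "noise_feature": [rng.gauss(0, 1) for _ in range(60)],
}
selected = select_features(features, labels)
print(selected)
```

In the study, the retained features were then passed to a random forest classifier. In a real pipeline, a filter like this would normally be fit inside each training fold so that the held-out data do not influence which features are kept.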

RESULTS

An optimal subset of 29 features was identified that resulted in an area under the receiver operating characteristic curve of 0.87, with a 95% CI of 0.81-0.94. The most important acoustic feature for MCI classification was the number of filled pauses (importance score=0.09, P=3.10E-08). There was no substantial difference in the performance of the model trained on the acoustic features derived from different lengths of voice recordings.
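To make the reported metric concrete, the sketch below computes an area under the ROC curve and a 95% interval around it. The abstract does not state how the CI was obtained, so the percentile bootstrap used here is an assumption, as are the function names and the synthetic scores; it only shows one common way such an interval can be produced.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((1 - level) / 2 * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lower, upper


# Synthetic classifier scores, not study data.
rng = random.Random(7)
labels = [1] * 50 + [0] * 50
scores = [rng.gauss(1.0, 1.0) for _ in range(50)] + [
    rng.gauss(0.0, 1.0) for _ in range(50)
]
auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC={auc:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

With 100 cases and 100 controls, an interval of roughly this width (0.81 to 0.94 in the study) is expected; a larger sample would narrow it.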

CONCLUSIONS

This study demonstrates the potential of monitoring changes in nonsemantic, acoustic features of speech as a means of detecting ADRD early, and it motivates future work on using human speech as a measure of brain health.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ad0/11377909/7832d2c67d7e/aging_v7i1e55126_fig1.jpg
