Lin Honghuang, Karjadi Cody, Ang Ting F A, Joshi Prajakta, McManus Chelsea, Alhanai Tuka W, Glass James, Au Rhoda
Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA.
The Framingham Heart Study, Boston University School of Medicine, Boston, MA 02118, USA.
Explor Med. 2020;1:406-417. doi: 10.37349/emed.2020.00028. Epub 2020 Dec 31.
The human voice carries rich information, yet few longitudinal studies have investigated its potential for monitoring cognitive health. The objective of this study was to identify voice biomarkers that are predictive of future dementia.
Participants were recruited from the Framingham Heart Study. Vocal responses to neuropsychological tests were recorded and then diarized to identify participant voice segments. Acoustic features were extracted with the OpenSMILE toolkit (v2.1). The association of each acoustic feature with incident dementia was assessed by Cox proportional hazards models.
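As a rough illustration of this workflow, the following is a minimal sketch, not the authors' pipeline: it uses the audeering opensmile Python wrapper (the study used the OpenSMILE v2.1 toolkit directly, so feature sets and names may differ) and the lifelines CoxPHFitter; the cohort file, its column names, and the covariates are assumptions.

```python
# Sketch: extract OpenSMILE functionals for diarized participant segments and
# test each acoustic feature against incident dementia with a Cox model.
import opensmile
import pandas as pd
from lifelines import CoxPHFitter

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,     # assumed feature set
    feature_level=opensmile.FeatureLevel.Functionals,  # one row per file
)

# Hypothetical cohort table: one row per recording with follow-up time (years),
# incident dementia status (1 = developed dementia), numerically coded
# covariates, and the path to the participant's diarized voice segment.
cohort = pd.read_csv("cohort.csv")  # columns: wav_path, time, dementia, age, sex

features = pd.concat(
    [smile.process_file(p).reset_index(drop=True) for p in cohort["wav_path"]],
    ignore_index=True,
)
data = pd.concat([cohort[["time", "dementia", "age", "sex"]], features], axis=1)

# Screen each acoustic feature with a covariate-adjusted Cox model and apply
# a Bonferroni threshold, analogous to the 0.05/48 cutoff used in the study.
pvalues = {}
for col in features.columns:
    cph = CoxPHFitter()
    cph.fit(data[["time", "dementia", "age", "sex", col]],
            duration_col="time", event_col="dementia")
    pvalues[col] = cph.summary.loc[col, "p"]

bonferroni = 0.05 / len(features.columns)
significant = {k: v for k, v in pvalues.items() if v < bonferroni}
print(f"{len(significant)} features pass the Bonferroni threshold {bonferroni:.2e}")
```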
Our study included 6,528 voice recordings from 4,849 participants (mean age 63 ± 15 years, 54.6% women). The majority of participants (71.2%) had one voice recording, 23.9% had two voice recordings, and the remaining participants (4.9%) had three or more voice recordings. Although all participants were asymptomatic at the time of examination, those who developed dementia tended to have shorter voice segments than those who remained dementia free (P < 0.001). Additionally, 14 acoustic features were significantly associated with dementia after adjusting for multiple testing (P < 0.05/48 ≈ 1 × 10^-3). The most significant acoustic feature was jitterDDP_sma_de (P = 7.9 × 10), which represents the differential frame-to-frame jitter. A voice-based linear classifier was also built that predicted incident dementia with an area under the curve (AUC) of 0.812.
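Continuing the sketch above (reusing its `data` table and `significant` feature set), the snippet below shows one way such a linear classifier could be trained and evaluated by AUC; the choice of logistic regression and five-fold cross-validation is an assumption, as the abstract does not specify the model or validation scheme.

```python
# Sketch: voice-based linear classifier evaluated by cross-validated AUC.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = data[list(significant)].to_numpy()  # features passing the Bonferroni screen
y = data["dementia"].to_numpy()         # 1 = incident dementia during follow-up

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predicted probabilities, then a single AUC over all participants.
probs = cross_val_predict(clf, X, y, cv=cv, method="predict_proba")[:, 1]
print(f"Cross-validated AUC: {roc_auc_score(y, probs):.3f}")
```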
Multiple acoustic and linguistic features were identified that are associated with incident dementia among asymptomatic participants; these features could be used to build better prediction models for passive cognitive health monitoring.