Temporal Integration of Text Transcripts and Acoustic Features for Alzheimer's Diagnosis Based on Spontaneous Speech.

Authors

Martinc Matej, Haider Fasih, Pollak Senja, Luz Saturnino

Affiliations

Department of Knowledge Technologies, Jozef Stefan Institute, Ljubljana, Slovenia.

Usher Institute, Edinburgh Medical School, The University of Edinburgh, Edinburgh, United Kingdom.

Publication information

Front Aging Neurosci. 2021 Jun 14;13:642647. doi: 10.3389/fnagi.2021.642647. eCollection 2021.

Abstract

Advances in machine learning (ML) technology have opened new avenues for detection and monitoring of cognitive decline. In this study, a multimodal approach to Alzheimer's dementia detection based on the patient's spontaneous speech is presented. This approach was tested on a standard, publicly available Alzheimer's speech dataset for comparability. The data comprise voice samples from 156 participants (1:1 ratio of Alzheimer's to control), matched by age and gender. A recently developed Active Data Representation (ADR) technique for voice processing was employed as a framework for fusion of acoustic and textual features at sentence and word level. Temporal aspects of textual features were investigated in conjunction with acoustic features in order to shed light on the temporal interplay between paralinguistic (acoustic) and linguistic (textual) aspects of Alzheimer's speech. Combinations of several configurations of ADR features and more traditional bag-of-n-grams approaches were used in an ensemble of classifiers built and evaluated on a standardised dataset containing recorded speech of scene descriptions and textual transcripts. Employing only semantic bag-of-n-grams features, an accuracy of 89.58% was achieved in distinguishing between Alzheimer's patients and healthy controls. Adding temporal and structural information by combining bag-of-n-grams features with ADR audio/textual features improved the accuracy to 91.67% on the test set. An accuracy of 93.75% was achieved through late fusion of the three best feature configurations, which corresponds to a 4.7% improvement over the best result reported in the literature for this dataset. The proposed combination of ADR audio and textual features is capable of successfully modelling temporal aspects of the data. The machine learning approach toward dementia detection achieves its best performance when ADR features are combined with strong semantic bag-of-n-grams features. This combination leads to state-of-the-art performance on the AD classification task.
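
The two ideas named in the abstract, bag-of-n-grams features and late fusion of several classifiers, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the n-gram range or the fusion rule, so word n-grams up to n = 2 and simple majority voting are assumptions made here, and the function names (`bag_of_ngrams`, `late_fusion_majority`, `accuracy`) and the toy labels are hypothetical.

```python
from collections import Counter


def bag_of_ngrams(text, n_max=2):
    """Count word n-grams (n = 1..n_max) in a whitespace-tokenised transcript.

    Assumption: lowercasing and whitespace splitting stand in for whatever
    tokenisation the study actually used.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def late_fusion_majority(predictions):
    """Fuse label predictions from several models by majority vote.

    predictions: one list of labels per model, all of the same length.
    With an odd number of models and binary labels there are no ties.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same samples")
    # zip(*predictions) groups the votes cast for each sample.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Toy labels invented for illustration; not data from the study.
    y_true = [1, 1, 0, 0]
    model_a = [1, 1, 0, 1]
    model_b = [1, 0, 0, 0]
    model_c = [0, 1, 0, 0]
    fused = late_fusion_majority([model_a, model_b, model_c])
    print(accuracy(y_true, model_a), accuracy(y_true, fused))
    print(bag_of_ngrams("the boy is on the stool"))
```

In the toy example each model makes one error on a different sample, so the vote corrects all three. This is the general motivation for late fusion: it helps when the fused models make different mistakes.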

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a68e/8236853/19fedd8b2db2/fnagi-13-642647-g0001.jpg