Alzheimer's disease recognition using graph neural network by leveraging image-text similarity from vision language model.

Author information

Lee Byounghwa, Bang Jeong-Uk, Song Hwa Jeon, Kang Byung Ok

Affiliation

Integrated Intelligence Research Section, Electronics and Telecommunications Research Institute, Daejeon, 34129, Republic of Korea.

Publication information

Sci Rep. 2025 Jan 6;15(1):997. doi: 10.1038/s41598-024-82597-z.

DOI:10.1038/s41598-024-82597-z

PMID:39762277

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11704039/

Abstract

Alzheimer's disease (AD), a progressive neurodegenerative condition, notably impacts cognitive functions and daily activity. One method of detecting dementia involves a task where participants describe a given picture, and extensive research has been conducted using the participants' speech and transcribed text. However, very few studies have explored the modality of the image itself. In this work, we propose a method that predicts dementia automatically by representing the relationship between images and texts as a graph. First, we transcribe the participants' speech into text using an automatic speech recognition system. Then, we employ a vision language model to represent the relationship between the parts of the image and the corresponding descriptive sentences as a bipartite graph. Finally, we use a graph convolutional network (GCN), considering each subject as an individual graph, to classify AD patients through a graph-level classification task. In experiments conducted on the ADReSSo Challenge datasets, our model surpassed the existing state-of-the-art performance by achieving an accuracy of 88.73%. Additionally, ablation studies that removed the relationship between images and texts demonstrated the critical role of graphs in improving performance. Furthermore, by utilizing the sentence representations learned through the GCN, we identified the sentences and keywords critical for AD classification.
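The pipeline described in the abstract (image-text similarities arranged as a bipartite graph, then a GCN with a graph-level readout) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the embeddings are random stand-ins for the output of a vision language model, the edge rule (cosine similarity above a threshold), the Kipf and Welling style normalization, the single layer, the mean readout, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bipartite_adjacency(region_emb, sentence_emb, threshold=0.0):
    """Weighted adjacency over nodes [regions..., sentences...].

    Edges connect only an image region to a sentence (never two nodes of
    the same kind). Assumed rule: keep an edge when the similarity exceeds
    a non-negative threshold, so every edge weight is positive.
    """
    r, s = len(region_emb), len(sentence_emb)
    adj = [[0.0] * (r + s) for _ in range(r + s)]
    for i, u in enumerate(region_emb):
        for j, v in enumerate(sentence_emb):
            w = cosine(u, v)
            if w > threshold:
                adj[i][r + j] = adj[r + j][i] = w
    return adj


def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_norm, feats, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    out = matmul(matmul(a_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def classify_graph(region_emb, sentence_emb, weight, readout_w, bias=0.0):
    """Score one subject's graph; returns a value in (0, 1)."""
    adj = bipartite_adjacency(region_emb, sentence_emb)
    hidden = gcn_layer(normalize(adj), region_emb + sentence_emb, weight)
    pooled = [sum(col) / len(hidden) for col in zip(*hidden)]  # mean readout
    logit = sum(p * w for p, w in zip(pooled, readout_w)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    rng = random.Random(0)
    dim, hidden_dim = 8, 4
    # Random stand-ins: 5 image regions and 3 transcribed sentences.
    regions = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
    sentences = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
    # Untrained weights; a real model would learn these from labeled graphs.
    w = [[rng.gauss(0, 0.5) for _ in range(hidden_dim)] for _ in range(dim)]
    readout = [rng.gauss(0, 0.5) for _ in range(hidden_dim)]
    print(classify_graph(regions, sentences, w, readout))
```

Because each subject becomes one graph and the readout pools over all nodes, the classifier sees a fixed-size vector regardless of how many sentences a participant produced. With untrained weights the printed score is meaningless; the sketch only shows the data flow.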

Similar articles

Multimodal feature fusion-based graph convolutional networks for Alzheimer's disease stage classification using F-18 florbetaben brain PET images and clinical indicators.

PLoS One. 2024 Dec 23;19(12):e0315809. doi: 10.1371/journal.pone.0315809. eCollection 2024.

Hi-GCN: A hierarchical graph convolution network for graph embedding learning of brain network and brain disorders prediction.

Comput Biol Med. 2020 Dec;127:104096. doi: 10.1016/j.compbiomed.2020.104096. Epub 2020 Nov 3.

A Graph Convolutional Network Based on Univariate Neurodegeneration Biomarker for Alzheimer's Disease Diagnosis.

IEEE J Transl Eng Health Med. 2023 Jun 13;11:405-416. doi: 10.1109/JTEHM.2023.3285723. eCollection 2023.

Task-radMBNet: An Improved Task-Driven Dynamic Graph Sparsity Pattern Radiomics-Based Morphological Brain Network for Alzheimer's Disease Characterization.

Brain Connect. 2025 Apr;15(3):139-149. doi: 10.1089/brain.2024.0053. Epub 2025 Apr 8.

[Coupled convolutional and graph network-based diagnosis of Alzheimer's disease using MRI].

Nan Fang Yi Ke Da Xue Xue Bao. 2020 Apr 30;40(4):531-537. doi: 10.12122/j.issn.1673-4254.2020.04.13.

Graph Convolutional Network for AD and MCI Diagnosis Utilizing Peripheral DNA Methylation: Réseau de neurones en graphes pour le diagnostic de la MA et du TCL à l'aide de la méthylation de l'ADN périphérique.

Can J Psychiatry. 2024 Dec;69(12):869-879. doi: 10.1177/07067437241300947. Epub 2024 Nov 25.

Early prediction of dementia using fMRI data with a graph convolutional network approach.

J Neural Eng. 2024 Jan 29;21(1). doi: 10.1088/1741-2552/ad1e22.

MVS-GCN: A prior brain structure learning-guided multi-view graph convolution network for autism spectrum disorder diagnosis.

Comput Biol Med. 2022 Mar;142:105239. doi: 10.1016/j.compbiomed.2022.105239. Epub 2022 Jan 19.

An Alzheimer's Disease classification network based on MRI utilizing diffusion maps for multi-scale feature fusion in graph convolution.

Math Biosci Eng. 2024 Jan;21(1):1554-1572. doi: 10.3934/mbe.2024067. Epub 2022 Dec 29.

Cited by

Multimodal Alzheimer's disease recognition from image, text and audio.

Sci Rep. 2025 Aug 8;15(1):29038. doi: 10.1038/s41598-025-14998-7.

Deep ensemble learning with transformer models for enhanced Alzheimer's disease detection.

Sci Rep. 2025 Jul 9;15(1):24720. doi: 10.1038/s41598-025-08362-y.

Advances in gait research related to Alzheimer's disease.

Front Neurol. 2025 Jun 3;16:1548283. doi: 10.3389/fneur.2025.1548283. eCollection 2025.
