Almalki Hassan, Khadidos Alaa O, Alhebaishi Nawaf, Senan Ebrahim Mohammed
Department of Information Technology, College of Technology for Communications and Information, Technical and Vocational Training Corporation, Riyadh, Saudi Arabia.
Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia.
Sci Rep. 2025 May 14;15(1):16799. doi: 10.1038/s41598-025-01072-5.
Alzheimer's disease (AD) is a neurodegenerative disorder that affects memory and cognitive function. Manual diagnosis is prone to human error, often leading to misdiagnosis or delayed detection. MRI visualizes fine brain tissue structures and indicates the stage of disease progression, and artificial intelligence techniques can analyze MRI with high accuracy, extracting subtle features that are difficult to detect manually. In this study, a methodology was designed that combines the strength of CNN models (ResNet101 and GoogLeNet) in extracting local deep features with the strength of Vision Transformer (ViT) models in extracting global features and capturing relationships between image patches. First, MRI images from the Open Access Series of Imaging Studies (OASIS) dataset were enhanced with two filters: an adaptive median filter (AMF) and a Laplacian filter. The ResNet101 and GoogLeNet models were modified to suit the feature-extraction task and to reduce computational cost. The ViT architecture was likewise modified to reduce computational cost while increasing the number of attention heads, allowing it to discover more global features and relationships between image patches. The enhanced images were fed into the proposed ViT-CNN methodology: the modified ResNet101 and GoogLeNet models extracted deep feature maps with high accuracy, and these feature maps were passed to the modified ViT model. The deep features were partitioned into 32 patches for ResNet101 and 16 patches for GoogLeNet, each patch holding 64 features. The patches were positionally encoded to capture their spatial arrangement and preserve inter-patch relationships, helping the self-attention layers distinguish patches by position. They were then fed to the transformer encoder, which consists of six blocks with multiple attention heads that attend to different patterns or regions simultaneously. Finally, MLP classification layers assign each image to one of the four dataset classes. The improved ResNet101-ViT hybrid methodology outperformed the GoogLeNet-ViT hybrid, achieving 98.7% accuracy, 95.05% AUC, 96.45% precision, 99.68% sensitivity, and 97.78% specificity.
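As a rough illustration of the preprocessing step, the sketch below denoises an MRI slice with a simplified adaptive median filter and then sharpens it by subtracting a scaled Laplacian. The window-growth rule, the maximum window size, and the sharpening weight alpha are assumptions for illustration, not parameters reported in the paper; 8-bit grayscale input is also assumed.

```python
import numpy as np
from scipy.ndimage import laplace

def adaptive_median_filter(img, max_window=7):
    """Simplified adaptive median filter: grow the window at each pixel until
    the median is not an impulse (min/max) value, then replace only pixels
    that are themselves impulses."""
    off = max_window // 2
    padded = np.pad(img.astype(np.float64), off, mode="reflect")
    out = img.astype(np.float64).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            for win in range(3, max_window + 1, 2):
                r = win // 2
                region = padded[y + off - r:y + off + r + 1,
                                x + off - r:x + off + r + 1]
                zmin, zmed, zmax = region.min(), np.median(region), region.max()
                if zmin < zmed < zmax:                  # median is not impulse noise
                    if not (zmin < img[y, x] < zmax):   # pixel itself is an impulse
                        out[y, x] = zmed
                    break
            else:
                out[y, x] = zmed                        # window hit max size; use last median
    return out

def enhance_mri(img, alpha=0.5):
    """Denoise with the adaptive median filter, then sharpen edges by
    subtracting a scaled Laplacian (unsharp-style enhancement)."""
    denoised = adaptive_median_filter(img)
    sharpened = denoised - alpha * laplace(denoised)
    return np.clip(sharpened, 0, 255)
```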
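To make the ResNet101-ViT pipeline concrete, here is a minimal PyTorch sketch under stated assumptions: the 2048-dimensional pooled ResNet101 feature vector is partitioned into 32 tokens of 64 features each (32 x 64 = 2048), positionally encoded, processed by a six-block transformer encoder, and classified into the four OASIS classes by an MLP head. The number of heads (8), the feed-forward width, the dropout-free encoder, and the use of a class token are illustrative choices not specified in the abstract.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet101

class ResNet101ViT(nn.Module):
    """Sketch of a ResNet101 -> ViT hybrid: CNN features are split into
    32 tokens of 64 features, positionally encoded, passed through a
    6-block transformer encoder, and classified by an MLP head."""
    def __init__(self, num_tokens=32, token_dim=64, depth=6, heads=8,
                 mlp_dim=128, num_classes=4):
        super().__init__()
        backbone = resnet101(weights=None)
        backbone.fc = nn.Identity()                     # keep the 2048-d pooled features
        self.cnn = backbone
        self.num_tokens, self.token_dim = num_tokens, token_dim
        # Learnable class token and positional embeddings preserve patch order
        self.cls_token = nn.Parameter(torch.zeros(1, 1, token_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_tokens + 1, token_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=token_dim, nhead=heads, dim_feedforward=mlp_dim,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.mlp_head = nn.Sequential(
            nn.LayerNorm(token_dim), nn.Linear(token_dim, num_classes))

    def forward(self, x):                               # x: (B, 3, 224, 224)
        feats = self.cnn(x)                             # (B, 2048)
        tokens = feats.view(-1, self.num_tokens, self.token_dim)  # 32 "patches" of 64 features
        cls = self.cls_token.expand(tokens.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)
        return self.mlp_head(encoded[:, 0])             # classify via the class token

# Usage: logits = ResNet101ViT()(torch.randn(2, 3, 224, 224))  -> shape (2, 4)
```

Swapping the backbone for GoogLeNet, whose pooled output is 1024-dimensional, would correspond to 16 tokens of 64 features each, matching the partitioning described in the abstract.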