

Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer's Disease Detection.

Author Information

Dhinagar Nikhil J, Thomopoulos Sophia I, Laltoo Emily, Thompson Paul M

Publication Information

ArXiv. 2023 Mar 14:arXiv:2303.08216v1.

Abstract

Neuroimaging of large populations is valuable to identify factors that promote or resist brain disease, and to assist diagnosis, subtyping, and prognosis. Data-driven models such as convolutional neural networks (CNNs) have increasingly been applied to brain images to perform diagnostic and prognostic tasks by learning robust features. Vision transformers (ViT) - a new class of deep learning architectures - have emerged in recent years as an alternative to CNNs for several computer vision applications. Here we tested variants of the ViT architecture for a range of desired neuroimaging downstream tasks based on difficulty, in this case for sex and Alzheimer's disease (AD) classification based on 3D brain MRI. In our experiments, two vision transformer architecture variants achieved an AUC of 0.987 for sex and 0.892 for AD classification, respectively. We independently evaluated our models on data from two benchmark AD datasets. We achieved a performance boost of 5% and 9-10% upon fine-tuning vision transformer models pre-trained on synthetic (generated by a latent diffusion model) and real MRI scans, respectively. Our main contributions include testing the effects of different ViT training strategies including pre-training, data augmentation and learning rate warm-ups followed by annealing, as pertaining to the neuroimaging domain. These techniques are essential for training ViT-like models for neuroimaging applications where training data is usually limited. We also analyzed the effect of the amount of training data utilized on the test-time performance of the ViT via data-model scaling curves.
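The abstract highlights learning-rate warm-up followed by annealing as one of the training strategies essential for ViT-like models on limited neuroimaging data. As a minimal sketch of that general technique (the specific warm-up length, base learning rate, and step counts below are illustrative assumptions, not values reported by the paper), a linear warm-up followed by cosine annealing can be written as a plain step-to-learning-rate function:

```python
import math

def lr_schedule(step, total_steps, base_lr=1e-4, warmup_steps=500):
    """Linear warm-up followed by cosine annealing.

    Commonly used when training Vision Transformers; all numeric
    defaults here are illustrative, not taken from the paper.
    """
    if step < warmup_steps:
        # Linear ramp from ~0 up to base_lr over the warm-up phase.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr down toward 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

In a typical training loop this function would be evaluated once per optimizer step and the result assigned to each parameter group's learning rate (e.g. via `torch.optim.lr_scheduler.LambdaLR` in PyTorch).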

Similar Articles

- Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer's Disease Detection. Annu Int Conf IEEE Eng Med Biol Soc. 2023 Jul;2023:1-6. doi: 10.1109/EMBC40787.2023.10341190.
- RT-ViT: Real-Time Monocular Depth Estimation Using Lightweight Vision Transformers. Sensors (Basel). 2022 May 19;22(10):3849. doi: 10.3390/s22103849.
- FibroVit-Vision transformer-based framework for detection and classification of pulmonary fibrosis from chest CT images. Front Med (Lausanne). 2023 Nov 8;10:1282200. doi: 10.3389/fmed.2023.1282200. eCollection 2023.
- Explainable Vision Transformer with Self-Supervised Learning to Predict Alzheimer's Disease Progression Using 18F-FDG PET. Bioengineering (Basel). 2023 Oct 20;10(10):1225. doi: 10.3390/bioengineering10101225.
