Ali Hazrat, Qureshi Rizwan, Shah Zubair
College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.
Department of Imaging Physics, MD Anderson Cancer Center, University of Texas, Houston, TX, United States.
JMIR Med Inform. 2023 Nov 17;11:e47445. doi: 10.2196/47445.
Transformer-based models are gaining popularity in medical imaging and cancer imaging applications. Many recent studies have demonstrated the use of transformer-based models for brain cancer imaging applications such as diagnosis and tumor segmentation.
This study aims to review how different vision transformers (ViTs) have contributed to advancing brain cancer diagnosis and tumor segmentation using brain image data. It examines the architectures developed to improve brain tumor segmentation and explores how ViT-based models have augmented the performance of convolutional neural networks for brain cancer imaging.
The study search and study selection in this review followed the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. The search covered 4 popular scientific databases: PubMed, Scopus, IEEE Xplore, and Google Scholar. The search terms were formulated to cover the interventions (ie, ViTs) and the target application (ie, brain cancer imaging). Title and abstract screening for study selection was performed by 2 reviewers independently and validated by a third reviewer. Data extraction was performed by 2 reviewers and validated by a third reviewer. Finally, the data were synthesized using a narrative approach.
Of the 736 retrieved studies, 22 (3%) were included in this review. These studies were published in 2021 and 2022. The most commonly addressed task in these studies was tumor segmentation using ViTs. No study reported early detection of brain cancer. Among the different ViT architectures, Shifted Window (Swin) transformer-based architectures have recently become the most popular choice of the research community. Among the included architectures, UNet transformer and TransUNet had the highest number of parameters and thus needed a cluster of as many as 8 graphics processing units for model training. The Brain Tumor Segmentation (BraTS) challenge data set was the most popular data set used in the included studies. ViTs were used in different combinations with convolutional neural networks to capture both the global and local context of the input brain imaging data.
It can be argued that the computational complexity of transformer architectures is a bottleneck in advancing the field and enabling clinical translation. This review summarizes the current state of knowledge on the topic, and its findings will be helpful for researchers in the field of medical artificial intelligence and its applications in brain cancer.
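To make the complexity argument concrete, the following minimal sketch compares the approximate cost of one global self-attention layer with one window-restricted attention layer, using the standard estimates 4nC² + 2n²C and 4nC² + 2M²nC for n tokens, channel width C, and window side M. The function names and the example values (C = 96, M = 7, square 2D token grids) are illustrative assumptions and are not taken from the reviewed studies; the same reasoning applies to 3D volumes with n as the total token count.

```python
def global_attention_flops(h, w, c):
    """Approximate FLOPs of one global self-attention layer
    over an h x w token grid with channel width c (hypothetical helper)."""
    n = h * w
    # 4*n*c^2 for the linear projections, 2*n^2*c for the attention itself.
    return 4 * n * c**2 + 2 * n**2 * c


def window_attention_flops(h, w, c, m):
    """Same estimate when attention is restricted to
    non-overlapping m x m windows (hypothetical helper)."""
    n = h * w
    # Each token attends only to the m*m tokens in its own window.
    return 4 * n * c**2 + 2 * m**2 * n * c


if __name__ == "__main__":
    c, m = 96, 7  # assumed channel width and window side, for illustration only
    for side in (56, 112, 224):
        g = global_attention_flops(side, side, c)
        wnd = window_attention_flops(side, side, c, m)
        print(f"{side}x{side} tokens: global/window cost ratio = {g / wnd:.1f}")
```

Under these assumptions, the global cost grows quadratically with the number of tokens, whereas the windowed cost grows linearly, which may partly explain the popularity of window-based designs for high-resolution brain images.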