Advances in medical image analysis with vision Transformers: A comprehensive review.

Author information

Azad Reza, Kazerouni Amirhossein, Heidari Moein, Aghdam Ehsan Khodapanah, Molaei Amirali, Jia Yiwei, Jose Abin, Roy Rijo, Merhof Dorit

Affiliations

Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany.

School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran.

Publication information

Med Image Anal. 2024 Jan;91:103000. doi: 10.1016/j.media.2023.103000. Epub 2023 Oct 19.

DOI:10.1016/j.media.2023.103000

PMID:37883822

Abstract

The remarkable performance of the Transformer architecture in natural language processing has recently triggered broad interest in computer vision as well. Among other merits, Transformers have been shown to be capable of learning long-range dependencies and spatial correlations, a clear advantage over convolutional neural networks (CNNs), which have so far been the de facto standard in computer vision. Thus, Transformers have become an integral part of modern medical image analysis. In this paper, we provide an encyclopedic review of the applications of Transformers in medical imaging. Specifically, we present a systematic and thorough survey of the recent Transformer literature for different medical image analysis tasks, including classification, segmentation, detection, registration, synthesis, and clinical report generation. For each of these applications, we investigate the novelty, strengths, and weaknesses of the proposed strategies and develop taxonomies that highlight key properties and contributions. Where applicable, we also outline current benchmarks on different datasets. Finally, we summarize key challenges and discuss future research directions. The cited papers and their corresponding implementations are listed at https://github.com/mindflow-institue/Awesome-Transformer.
