Cross-Attention Based Multi-Resolution Feature Fusion Model for Self-Supervised Cervical OCT Image Classification.

Author Information

Wang Qingbin, Chen Kaiyi, Dou Wanrong, Ma Yutao

Publication Information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Jul-Aug;20(4):2541-2554. doi: 10.1109/TCBB.2023.3246979. Epub 2023 Aug 9.

Abstract

Cervical cancer seriously endangers the health of the female reproductive system and can even be life-threatening in severe cases. Optical coherence tomography (OCT) is a non-invasive, real-time, high-resolution imaging technology for cervical tissues. However, because interpreting cervical OCT images is a knowledge-intensive, time-consuming task, it is difficult to acquire a large number of high-quality labeled images quickly, which poses a significant challenge for supervised learning. In this study, we introduce the vision Transformer (ViT) architecture, which has recently achieved impressive results in natural image analysis, into the classification task of cervical OCT images. Our work aims to develop a computer-aided diagnosis (CADx) approach based on a self-supervised ViT-based model to classify cervical OCT images effectively. We leverage masked autoencoders (MAE) to perform self-supervised pre-training on cervical OCT images, giving the proposed classification model better transfer-learning ability. In the fine-tuning process, the ViT-based classification model extracts multi-scale features from OCT images of different resolutions and fuses them with a cross-attention module. Ten-fold cross-validation results on an OCT image dataset from a multi-center clinical study of 733 patients in China indicate that our model achieved an AUC value of 0.9963 ± 0.0069, with 95.89 ± 3.30% sensitivity and 98.23 ± 1.36% specificity, outperforming several state-of-the-art classification models based on Transformers and convolutional neural networks (CNNs) in the binary classification task of detecting high-risk cervical diseases, including high-grade squamous intraepithelial lesion (HSIL) and cervical cancer. Furthermore, with a cross-shaped voting strategy, our model achieved a sensitivity of 92.06% and a specificity of 95.56% on an external validation dataset containing 288 three-dimensional (3D) OCT volumes from 118 Chinese patients at a different hospital, matching or exceeding the average performance of four medical experts who had each used OCT for over one year. In addition to its promising classification performance, our model can detect and visualize local lesions using the attention map of the standard ViT model, providing good interpretability that helps gynecologists locate and diagnose possible cervical diseases.
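
A brief sketch may help make the pre-training step concrete. In MAE-style self-supervised pre-training, the encoder sees only a random subset of patch tokens, and a lightweight decoder reconstructs the masked patches. Below is a minimal PyTorch sketch of that random-masking step, assuming the 75% mask ratio commonly used with MAE; the function name and tensor shapes are illustrative, not taken from this paper.

    import torch

    def random_masking(tokens: torch.Tensor, mask_ratio: float = 0.75):
        """Keep a random subset of patch tokens, MAE-style.

        tokens: (B, N, dim) patch embeddings of an OCT image.
        Returns the visible tokens, a binary mask (1 = masked), and the
        permutation needed to restore the original patch order.
        """
        B, N, dim = tokens.shape
        n_keep = int(N * (1 - mask_ratio))

        noise = torch.rand(B, N)                   # one random score per patch
        ids_shuffle = noise.argsort(dim=1)         # random patch permutation
        ids_restore = ids_shuffle.argsort(dim=1)   # inverse permutation

        ids_keep = ids_shuffle[:, :n_keep]
        visible = torch.gather(
            tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, dim))

        mask = torch.ones(B, N)                    # 1 = masked, 0 = visible
        mask[:, :n_keep] = 0
        mask = torch.gather(mask, 1, ids_restore)  # align mask with patch order
        return visible, mask, ids_restore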

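The fine-tuning stage fuses features from two input resolutions with cross attention. One common layout for this (used, for example, in CrossViT) lets the CLS token of one branch attend to the patch tokens of the other; whether the paper follows exactly this design is an assumption, so the module below is a sketch rather than the authors' implementation.

    import torch
    import torch.nn as nn

    class CrossAttentionFusion(nn.Module):
        """Fuse two resolution branches by letting one branch's CLS
        token attend to the other branch's patch tokens."""

        def __init__(self, dim: int = 768, num_heads: int = 8):
            super().__init__()
            self.norm_q = nn.LayerNorm(dim)
            self.norm_kv = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, cls_a: torch.Tensor, tokens_b: torch.Tensor) -> torch.Tensor:
            # cls_a:    (B, 1, dim) CLS token of branch A (one resolution)
            # tokens_b: (B, N, dim) patch tokens of branch B (another resolution)
            q, kv = self.norm_q(cls_a), self.norm_kv(tokens_b)
            fused, _ = self.attn(q, kv, kv)  # the CLS token queries the other branch
            return cls_a + fused             # residual connection

    # Toy usage with assumed shapes: a 224x224 input with 16x16 patches
    # yields 196 patch tokens; the fused CLS token feeds the classifier head.
    cls_a = torch.randn(2, 1, 768)
    tokens_b = torch.randn(2, 196, 768)
    print(CrossAttentionFusion()(cls_a, tokens_b).shape)  # torch.Size([2, 1, 768])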
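
For the interpretability claim, one simple way to obtain a lesion heat map from a standard ViT is to average the final block's CLS-to-patch attention over heads and reshape it to the patch grid; the layer choice and grid size below are assumptions for illustration, not details given in the abstract. Upsampling this coarse grid back to the input resolution would give the overlay a gynecologist inspects.

    import torch

    @torch.no_grad()
    def cls_attention_map(attn: torch.Tensor, grid: int = 14) -> torch.Tensor:
        """Turn a ViT block's attention weights into a coarse heat map.

        attn: (B, heads, 1+N, 1+N) softmaxed attention with the CLS token
        at index 0, e.g. taken from the final Transformer block.
        Returns a (B, grid, grid) map of CLS attention over the patches.
        """
        cls_to_patches = attn[:, :, 0, 1:]            # (B, heads, N)
        heat = cls_to_patches.mean(dim=1)             # average over heads
        heat = heat / heat.amax(dim=1, keepdim=True)  # normalize per image
        return heat.reshape(-1, grid, grid)           # 14x14 grid for 224/16 inputs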
