Cross-Attention Based Multi-Resolution Feature Fusion Model for Self-Supervised Cervical OCT Image Classification.

Author Information

Wang Qingbin, Chen Kaiyi, Dou Wanrong, Ma Yutao

Publication Information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Jul-Aug;20(4):2541-2554. doi: 10.1109/TCBB.2023.3246979. Epub 2023 Aug 9.

Abstract

Cervical cancer seriously endangers the health of the female reproductive system and can even be life-threatening in severe cases. Optical coherence tomography (OCT) is a non-invasive, real-time, high-resolution imaging technology for cervical tissues. However, because interpreting cervical OCT images is a knowledge-intensive, time-consuming task, it is difficult to acquire a large number of high-quality labeled images quickly, which poses a significant challenge for supervised learning. In this study, we introduce the vision Transformer (ViT) architecture, which has recently achieved impressive results in natural image analysis, into the classification task of cervical OCT images. Our work aims to develop a computer-aided diagnosis (CADx) approach based on a self-supervised ViT-based model to classify cervical OCT images effectively. We leverage masked autoencoders (MAE) to perform self-supervised pre-training on cervical OCT images, giving the proposed classification model better transfer-learning ability. In the fine-tuning process, the ViT-based classification model extracts multi-scale features from OCT images of different resolutions and fuses them with a cross-attention module. Ten-fold cross-validation results on an OCT image dataset from a multi-center clinical study of 733 patients in China indicate that our model achieved an AUC value of 0.9963 ± 0.0069, with 95.89 ± 3.30% sensitivity and 98.23 ± 1.36% specificity, outperforming several state-of-the-art classification models based on Transformers and convolutional neural networks (CNNs) in the binary classification task of detecting high-risk cervical diseases, including high-grade squamous intraepithelial lesion (HSIL) and cervical cancer. Furthermore, with a cross-shaped voting strategy, our model achieved a sensitivity of 92.06% and a specificity of 95.56% on an external validation dataset containing 288 three-dimensional (3D) OCT volumes from 118 Chinese patients at a different hospital, matching or exceeding the average performance of four medical experts who had each used OCT for over one year. In addition to its promising classification performance, our model can detect and visualize local lesions using the attention map of the standard ViT model, providing good interpretability that helps gynecologists locate and diagnose possible cervical diseases.
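
A brief sketch may help make the pre-training step concrete. In MAE-style self-supervised pre-training, the encoder sees only a random subset of patch tokens, and a lightweight decoder reconstructs the masked patches. Below is a minimal PyTorch sketch of that random-masking step, assuming the 75% mask ratio commonly used with MAE; the function name and tensor shapes are illustrative, not taken from this paper.

    import torch

    def random_masking(tokens: torch.Tensor, mask_ratio: float = 0.75):
        """Keep a random subset of patch tokens, MAE-style.

        tokens: (B, N, dim) patch embeddings of an OCT image.
        Returns the visible tokens, a binary mask (1 = masked), and the
        permutation needed to restore the original patch order.
        """
        B, N, dim = tokens.shape
        n_keep = int(N * (1 - mask_ratio))

        noise = torch.rand(B, N)                   # one random score per patch
        ids_shuffle = noise.argsort(dim=1)         # random patch permutation
        ids_restore = ids_shuffle.argsort(dim=1)   # inverse permutation

        ids_keep = ids_shuffle[:, :n_keep]
        visible = torch.gather(
            tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, dim))

        mask = torch.ones(B, N)                    # 1 = masked, 0 = visible
        mask[:, :n_keep] = 0
        mask = torch.gather(mask, 1, ids_restore)  # align mask with patch order
        return visible, mask, ids_restore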

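The fine-tuning stage fuses features from two input resolutions with cross attention. One common layout for this (used, for example, in CrossViT) lets the CLS token of one branch attend to the patch tokens of the other; whether the paper follows exactly this design is an assumption, so the module below is a sketch rather than the authors' implementation.

    import torch
    import torch.nn as nn

    class CrossAttentionFusion(nn.Module):
        """Fuse two resolution branches by letting one branch's CLS
        token attend to the other branch's patch tokens."""

        def __init__(self, dim: int = 768, num_heads: int = 8):
            super().__init__()
            self.norm_q = nn.LayerNorm(dim)
            self.norm_kv = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, cls_a: torch.Tensor, tokens_b: torch.Tensor) -> torch.Tensor:
            # cls_a:    (B, 1, dim) CLS token of branch A (one resolution)
            # tokens_b: (B, N, dim) patch tokens of branch B (another resolution)
            q, kv = self.norm_q(cls_a), self.norm_kv(tokens_b)
            fused, _ = self.attn(q, kv, kv)  # the CLS token queries the other branch
            return cls_a + fused             # residual connection

    # Toy usage with assumed shapes: a 224x224 input with 16x16 patches
    # yields 196 patch tokens; the fused CLS token feeds the classifier head.
    cls_a = torch.randn(2, 1, 768)
    tokens_b = torch.randn(2, 196, 768)
    print(CrossAttentionFusion()(cls_a, tokens_b).shape)  # torch.Size([2, 1, 768])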
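
For the interpretability claim, one simple way to obtain a lesion heat map from a standard ViT is to average the final block's CLS-to-patch attention over heads and reshape it to the patch grid; the layer choice and grid size below are assumptions for illustration, not details given in the abstract. Upsampling this coarse grid back to the input resolution would give the overlay a gynecologist inspects.

    import torch

    @torch.no_grad()
    def cls_attention_map(attn: torch.Tensor, grid: int = 14) -> torch.Tensor:
        """Turn a ViT block's attention weights into a coarse heat map.

        attn: (B, heads, 1+N, 1+N) softmaxed attention with the CLS token
        at index 0, e.g. taken from the final Transformer block.
        Returns a (B, grid, grid) map of CLS attention over the patches.
        """
        cls_to_patches = attn[:, :, 0, 1:]            # (B, heads, N)
        heat = cls_to_patches.mean(dim=1)             # average over heads
        heat = heat / heat.amax(dim=1, keepdim=True)  # normalize per image
        return heat.reshape(-1, grid, grid)           # 14x14 grid for 224/16 inputs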
