


Shared-Specific Feature Learning With Bottleneck Fusion Transformer for Multi-Modal Whole Slide Image Analysis.

Publication Information

IEEE Trans Med Imaging. 2023 Nov;42(11):3374-3383. doi: 10.1109/TMI.2023.3287256. Epub 2023 Oct 27.

DOI: 10.1109/TMI.2023.3287256
PMID: 37335798
Abstract

The fusion of multi-modal medical data is essential to assist medical experts to make treatment decisions for precision medicine. For example, combining the whole slide histopathological images (WSIs) and tabular clinical data can more accurately predict the lymph node metastasis (LNM) of papillary thyroid carcinoma before surgery to avoid unnecessary lymph node resection. However, the huge-sized WSI provides much more high-dimensional information than low-dimensional tabular clinical data, making the information alignment challenging in the multi-modal WSI analysis tasks. This paper presents a novel transformer-guided multi-modal multi-instance learning framework to predict lymph node metastasis from both WSIs and tabular clinical data. We first propose an effective multi-instance grouping scheme, named siamese attention-based feature grouping (SAG), to group high-dimensional WSIs into representative low-dimensional feature embeddings for fusion. We then design a novel bottleneck shared-specific feature transfer module (BSFT) to explore the shared and specific features between different modalities, where a few learnable bottleneck tokens are utilized for knowledge transfer between modalities. Moreover, a modal adaptation and orthogonal projection scheme were incorporated to further encourage BSFT to learn shared and specific features from multi-modal data. Finally, the shared and specific features are dynamically aggregated via an attention mechanism for slide-level prediction. Experimental results on our collected lymph node metastasis dataset demonstrate the efficiency of our proposed components and our framework achieves the best performance with AUC (area under the curve) of 97.34%, outperforming the state-of-the-art methods by over 1.27%.
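The bottleneck-token fusion described in the abstract can be sketched as follows. This is a minimal NumPy illustration of the general idea — each modality attends only over its own tokens plus a few shared bottleneck tokens, and cross-modal information flows exclusively through those tokens — not the authors' BSFT implementation: projection weights, multi-head attention, and the modal-adaptation and orthogonal-projection objectives are all omitted, and every name and dimension here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, d):
    # Single-head scaled dot-product self-attention
    # (learned Q/K/V projections omitted for brevity).
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores) @ tokens

def bottleneck_fusion_layer(wsi_tokens, clin_tokens, bottleneck):
    """One fusion step: each modality attends over its own tokens plus
    the shared bottleneck tokens; the bottleneck tokens are the only
    channel through which the two modalities exchange information."""
    d = bottleneck.shape[-1]

    a = self_attention(np.concatenate([wsi_tokens, bottleneck]), d)
    wsi_out, bn_from_wsi = a[: len(wsi_tokens)], a[len(wsi_tokens):]

    b = self_attention(np.concatenate([clin_tokens, bottleneck]), d)
    clin_out, bn_from_clin = b[: len(clin_tokens)], b[len(clin_tokens):]

    # Merge the two modality-specific bottleneck updates.
    bn_out = 0.5 * (bn_from_wsi + bn_from_clin)
    return wsi_out, clin_out, bn_out

# Toy dimensions: 8 grouped WSI embeddings (e.g. from a SAG-style
# grouping step), 4 clinical tokens, 2 bottleneck tokens, feature dim 16.
wsi = rng.standard_normal((8, 16))
clin = rng.standard_normal((4, 16))
bn = rng.standard_normal((2, 16))

wsi2, clin2, bn2 = bottleneck_fusion_layer(wsi, clin, bn)
print(wsi2.shape, clin2.shape, bn2.shape)  # (8, 16) (4, 16) (2, 16)
```

Because the bottleneck holds only a few tokens, it forces a compressed exchange between the huge WSI representation and the small clinical table, which is the information-alignment point the abstract emphasizes.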


Similar Articles

1
Shared-Specific Feature Learning With Bottleneck Fusion Transformer for Multi-Modal Whole Slide Image Analysis.
IEEE Trans Med Imaging. 2023 Nov;42(11):3374-3383. doi: 10.1109/TMI.2023.3287256. Epub 2023 Oct 27.
2
Lymph Node Metastasis Prediction From Whole Slide Images With Transformer-Guided Multiinstance Learning and Knowledge Transfer.
IEEE Trans Med Imaging. 2022 Oct;41(10):2777-2787. doi: 10.1109/TMI.2022.3171418. Epub 2022 Sep 30.
3
A modality-collaborative convolution and transformer hybrid network for unpaired multi-modal medical image segmentation with limited annotations.
Med Phys. 2023 Sep;50(9):5460-5478. doi: 10.1002/mp.16338. Epub 2023 Mar 15.
4
Second-order multi-instance learning model for whole slide image classification.
Phys Med Biol. 2021 Jul 12;66(14). doi: 10.1088/1361-6560/ac0f30.
5
Colorectal cancer lymph node metastasis prediction with weakly supervised transformer-based multi-instance learning.
Med Biol Eng Comput. 2023 Jun;61(6):1565-1580. doi: 10.1007/s11517-023-02799-x. Epub 2023 Feb 21.
6
Identification of lymph node metastasis in pre-operation cervical cancer patients by weakly supervised deep learning from histopathological whole-slide biopsy images.
Cancer Med. 2023 Sep;12(17):17952-17966. doi: 10.1002/cam4.6437. Epub 2023 Aug 10.
7
Automated multi-modal Transformer network (AMTNet) for 3D medical images segmentation.
Phys Med Biol. 2023 Jan 9;68(2). doi: 10.1088/1361-6560/aca74c.
8
Multi-Scale Efficient Graph-Transformer for Whole Slide Image Classification.
IEEE J Biomed Health Inform. 2023 Dec;27(12):5926-5936. doi: 10.1109/JBHI.2023.3317067. Epub 2023 Dec 5.
9
MC-ViT: Multi-path cross-scale vision transformer for thymoma histopathology whole slide image typing.
Front Oncol. 2022 Oct 31;12:925903. doi: 10.3389/fonc.2022.925903. eCollection 2022.
10
Masked autoencoders with handcrafted feature predictions: Transformer for weakly supervised esophageal cancer classification.
Comput Methods Programs Biomed. 2024 Feb;244:107936. doi: 10.1016/j.cmpb.2023.107936. Epub 2023 Nov 22.

Articles Citing This Publication

1
Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning.
Nat Commun. 2025 Jul 23;16(1):6773. doi: 10.1038/s41467-025-62060-x.
2
MSFT-transformer: a multistage fusion tabular transformer for disease prediction using metagenomic data.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf217.
3
A survey of Transformer applications for histopathological image analysis: New developments and future directions.
Biomed Eng Online. 2023 Sep 25;22(1):96. doi: 10.1186/s12938-023-01157-0.