School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China.
Comput Biol Med. 2023 Oct;165:107387. doi: 10.1016/j.compbiomed.2023.107387. Epub 2023 Aug 28.
Multimodal medical image detection is a key technology in medical image analysis and plays an important role in tumor diagnosis. Lesions in multimodal lung tumor images vary in size and shape, which makes it difficult to effectively extract the key features of lung tumor lesions.
A Cross-modal Cross-scale Global-Local Attention YOLOV5 Lung Tumor Detection Model (CCGL-YOLOV5) is proposed in this paper. The main contributions are as follows. First, the Cross-Modal Fusion Transformer Module (CMFTM) is designed to improve the extraction and fusion of key multimodal lesion features through interactive, assisted fusion of multimodal features. Second, the Global-Local Feature Interaction Module (GLFIM) is proposed to enhance the interaction between multimodal global features and multimodal local features through bidirectional interactive branches. Third, the Cross-Scale Attention Fusion Module (CSAFM) is designed to obtain rich multi-scale features by fusing features with grouped multi-scale attention.
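To make the cross-modal fusion idea concrete, the sketch below shows a minimal bidirectional cross-attention fusion of PET and CT feature maps in PyTorch. It is an illustrative stand-in only: the abstract does not detail the CMFTM's internals, and every name and shape here (CrossModalAttentionFusion, dim, num_heads, the 20x20 feature maps) is an assumption rather than the authors' implementation.

```python
# Hypothetical sketch of cross-modal attention fusion between PET and CT
# feature maps; module names and shapes are illustrative only, not the
# paper's CMFTM (which is not specified in the abstract).
import torch
import torch.nn as nn

class CrossModalAttentionFusion(nn.Module):
    """Fuses PET and CT feature maps with bidirectional cross-attention."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Each modality attends to the other: queries from one stream,
        # keys/values from the other.
        self.pet_to_ct = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ct_to_pet = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_pet = nn.LayerNorm(dim)
        self.norm_ct = nn.LayerNorm(dim)
        self.proj = nn.Linear(2 * dim, dim)  # merge the two attended streams

    def forward(self, pet: torch.Tensor, ct: torch.Tensor) -> torch.Tensor:
        # pet, ct: (B, C, H, W) feature maps from the two modality branches.
        b, c, h, w = pet.shape
        pet_seq = pet.flatten(2).transpose(1, 2)  # (B, H*W, C)
        ct_seq = ct.flatten(2).transpose(1, 2)

        # PET queries attend over CT keys/values, and vice versa.
        pet_attn, _ = self.pet_to_ct(pet_seq, ct_seq, ct_seq)
        ct_attn, _ = self.ct_to_pet(ct_seq, pet_seq, pet_seq)

        # Residual connection + layer norm per stream, then concatenate
        # and project back to the original channel dimension.
        pet_out = self.norm_pet(pet_seq + pet_attn)
        ct_out = self.norm_ct(ct_seq + ct_attn)
        fused = self.proj(torch.cat([pet_out, ct_out], dim=-1))  # (B, H*W, C)
        return fused.transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    fusion = CrossModalAttentionFusion(dim=256)
    pet_feat = torch.randn(2, 256, 20, 20)
    ct_feat = torch.randn(2, 256, 20, 20)
    print(fusion(pet_feat, ct_feat).shape)  # torch.Size([2, 256, 20, 20])
```

In such a design, the fused map keeps the spatial resolution of the detector's feature pyramid, so it can drop into a YOLO-style neck in place of a single-modality feature map; the GLFIM and CSAFM described above would then operate on top of fused features of this kind.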
Comparison experiments with advanced networks were conducted. On a multimodal lung tumor PET/CT dataset, the CCGL-YOLOV5 model achieves an Acc of 97.83%, Rec of 97.39%, mAP of 96.67%, F1 score of 97.61%, and FPS of 98.59. The experimental results show that the CCGL-YOLOV5 model outperforms other typical models.
The CCGL-YOLOV5 model can effectively exploit multimodal feature information and has important implications for multimodal medical image research and clinical disease diagnosis.