ViruSeg: Harnessing the Power of Large Language-Image Model for Enhanced Virus Image Segmentation.

Authors

Wang Shengxiang, Fu Xiangzheng, Du Zhenya, Liu Xinxin, Mai Qiaochu, Zhuo Linlin, Xie Boqia, Zou Quan

Affiliations

School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325027, Wenzhou, China.

School of Nursing, Teaching and Research Department of Public Medical Courses, Guangzhou Xinhua University, 510520, Guangzhou, China.

Publication information

Interdiscip Sci. 2025 Jun 27. doi: 10.1007/s12539-025-00711-9.

DOI:10.1007/s12539-025-00711-9

PMID:40579681

Abstract

The emergence of novel viral diseases, with SARS-CoV-2 as a stark example, poses an increasing threat to public health and causes significant global morbidity and mortality. Accurate identification and segmentation of viruses in images are crucial for tracking virus progression and mutations and for devising new treatment strategies. Advanced virus recognition and segmentation models built on high-performance networks such as U-Net have achieved notable success. However, these models face several challenges, including limited labeled virus images, significant morphological variability, and indistinct boundaries. To address these challenges, this study introduces ViruSeg, which is based on the EVA-02 large language-image pre-trained model and data augmentation techniques and is designed to perform virus segmentation efficiently. First, ViruSeg employs data augmentation techniques such as cutout and image fine-tuning to enrich electron microscope virus images, which improves model generalization and helps delineate virus boundaries and varied morphologies. Second, ViruSeg uses the EVA-02 pre-trained model to learn a universal representation of virus images, improving adaptability to data scarcity. Finally, virus segmentation is performed with the Cascade Mask R-CNN (CMR) model. Comprehensive evaluations on benchmark datasets demonstrate that ViruSeg outperforms advanced virus segmentation methods. We anticipate that the proposed solution will advance virology research and the development of treatments for related diseases. All datasets and code are available at https://github.com/xiachashuanghua/project.
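
The abstract names cutout as one of the augmentation techniques but gives no parameters. The following is a minimal sketch of generic cutout augmentation in Python with NumPy, not the authors' implementation. The function name `cutout`, the patch size, the fill value, and the choice to clip the patch at the image border are all assumptions made for illustration.

```python
import numpy as np


def cutout(image, mask_size, rng, fill_value=0):
    """Return a copy of `image` with one square patch set to `fill_value`.

    Hypothetical helper: the patch size and fill value are illustrative
    assumptions, not settings reported for ViruSeg.
    """
    if mask_size <= 0:
        raise ValueError("mask_size must be positive")

    height, width = image.shape[:2]

    # Pick the patch centre uniformly over the image.
    center_y = int(rng.integers(0, height))
    center_x = int(rng.integers(0, width))

    # Compute the patch bounds and clip them to the image borders,
    # so patches near an edge are smaller than mask_size.
    top = center_y - mask_size // 2
    left = center_x - mask_size // 2
    y0, y1 = max(top, 0), min(top + mask_size, height)
    x0, x1 = max(left, 0), min(left + mask_size, width)

    # Work on a copy so the original image is left untouched.
    augmented = image.copy()
    augmented[y0:y1, x0:x1] = fill_value
    return augmented


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    # Stand-in for a grayscale electron microscope image.
    micrograph = np.full((256, 256), 128, dtype=np.uint8)
    augmented = cutout(micrograph, mask_size=64, rng=rng)
    print("pixels masked:", int((augmented == 0).sum()))
```

For a segmentation task, whether the annotation masks under the erased patch are also altered is a design decision that the abstract does not specify.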
