Tian Geng, Wang Ziwei, Wang Chang, Chen Jianhua, Liu Guangyi, Xu He, Lu Yuankang, Han Zhuoran, Zhao Yubo, Li Zejun, Luo Xueming, Peng Lihong
School of Computer Science, Hunan University of Technology, Zhuzhou, China.
Geneis (Beijing) Co., Ltd., Beijing, China.
Front Microbiol. 2022 Nov 4;13:1024104. doi: 10.3389/fmicb.2022.1024104. eCollection 2022.
Since the outbreak of COVID-19, hundreds of millions of people have been infected and millions have died, heavily disrupting daily life for countless people. Accurately identifying patients and isolating them promptly is essential to stopping the spread of COVID-19. Besides the nucleic acid test, lung CT imaging offers another route to rapid identification of COVID-19 patients. In this context, deep learning can help radiologists identify COVID-19 patients from CT images quickly. In this paper, we propose a deep learning ensemble framework called VitCNX, which combines Vision Transformer and ConvNeXt for COVID-19 CT image identification. We compared VitCNX with four state-of-the-art image classification models (EfficientNetV2, DenseNet, ResNet-50, and Swin Transformer) and with the two individual models used in the ensemble (Vision Transformer and ConvNeXt) in binary and three-class classification experiments. In the binary classification experiment, VitCNX achieves the best recall (0.9907), accuracy (0.9821), F1-score (0.9855), AUC (0.9985), and AUPR (0.9991), outperforming the other six models. Likewise, in the three-class experiment, VitCNX achieves the best precision (0.9668), accuracy (0.9696), and F1-score (0.9631), further demonstrating its strong image classification capability. We hope the proposed VitCNX model can contribute to the recognition of COVID-19 patients.