Kumar Sunil, Kumar Harish
Department of Information Technology, School of Engineering and Technology (UIET), CSJM University, Kanpur, UP, India.
Department of Computer Engineering, J.C. Bose University of Science and Technology, YMCA, Faridabad, India.
MethodsX. 2023 Jul 22;11:102295. doi: 10.1016/j.mex.2023.102295. eCollection 2023 Dec.
COVID-19 is a highly transmissible infectious disease that remains a substantial challenge. Chest radiology, particularly X-ray imaging, has proven to be effective, easily accessible, and cost-efficient for detecting COVID-19. The investigation uses COVID-Xray-5k, a class-imbalanced dataset of chest X-ray images of COVID-19-positive and normal subjects. The research introduces a methodology that applies conventional machine learning (ML), with local binary patterns (LBP) for feature extraction and support vector machines (SVM) for classification. In addition, transfer learning is applied with the Visual Geometry Group 16-layer (VGG16) and 19-layer (VGG19) models. Novel sequential convolutional neural network (CNN) architectures are also presented to develop an autonomous system for classifying COVID-19. With hyper-parameters selected through an empirical investigation, one of the proposed CNN architectures classifies the test dataset with an F1 score of 91.00% and an accuracy of 99.45%. The methods presented in the research show promising potential for COVID-19 classification, irrespective of class imbalance.
•Employment of ML models to investigate subjective feature engineering and classification.
•Transfer learning was employed for VGG16 and VGG19 with eight distinct models.
•Illustration of two novel sequential CNN architectures; all investigations are performed with and without weighted sampling.
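As a rough illustration of two ideas named in the abstract, the sketch below computes a basic 3x3 LBP histogram as a feature vector and derives inverse-frequency sample weights for an imbalanced label set. It is not the authors' implementation: the function names `lbp_histogram` and `inverse_frequency_weights`, the toy image, and the toy labels are all hypothetical, and the abstract does not state which LBP variant or weighting scheme was used. In practice a library routine such as scikit-image's `local_binary_pattern` would likely replace the hand-written loop.

```python
from collections import Counter


def lbp_histogram(image):
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a list of equal-length rows of grayscale intensities.
    Border pixels are skipped because they lack a full 3x3 neighborhood.
    """
    # Clockwise neighbor offsets (row, col), starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbor at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def inverse_frequency_weights(labels):
    """Weight each sample inversely to its class count (assumed scheme)."""
    counts = Counter(labels)
    n_classes = len(counts)
    n_samples = len(labels)
    return [n_samples / (n_classes * counts[y]) for y in labels]


# Toy 4x4 image (hypothetical values, not taken from COVID-Xray-5k).
toy_image = [[5, 5, 5, 5],
             [5, 9, 1, 5],
             [5, 5, 5, 5],
             [5, 5, 5, 5]]
features = lbp_histogram(toy_image)

# Toy labels with a 1:4 class imbalance (hypothetical).
weights = inverse_frequency_weights([1, 0, 0, 0, 0])
```

The resulting histogram could be passed to a classifier such as an SVM, and the weights make each class contribute equally in total, which is one common way to counter class imbalance.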