Dey Nilanjan, Rajinikanth V, Fong Simon James, Kaiser M Shamim, Mahmud Mufti
Department of Information Technology, Techno India College of Technology, Kolkata, 700156 West Bengal India.
Department of Electronics and Instrumentation Engineering, St. Joseph's College of Engineering, Chennai, 600119 India.
Cognit Comput. 2020;12(5):1011-1023. doi: 10.1007/s12559-020-09751-3. Epub 2020 Aug 15.
Coronavirus disease (COVID-19), caused by the novel coronavirus SARS-CoV-2, has been declared a global pandemic. Owing to its infection rate and severity, it has emerged as one of the major global threats of the current generation. To support efforts against the disease, this research proposes a machine learning-based pipeline to detect COVID-19 infection from lung computed tomography scan images (CTI). The pipeline consists of a number of sub-procedures, ranging from segmenting the COVID-19 infection to classifying the segmented regions. The first part of the pipeline segments the COVID-19-affected CTI using social group optimization-based Kapur's entropy thresholding, followed by k-means clustering and morphology-based segmentation. The next part implements feature extraction, selection, and fusion to classify the infection. A principal component analysis-based serial fusion technique is used to fuse the features, and the fused feature vector is then employed to train, test, and validate four different classifiers: Random Forest, K-Nearest Neighbors (KNN), Support Vector Machine with Radial Basis Function, and Decision Tree. Experimental results using benchmark datasets show a high accuracy (> 91%) for the morphology-based segmentation task; for the classification task, KNN offers the highest accuracy among the compared classifiers (> 87%). However, it should be noted that this method still awaits clinical validation and therefore should not be used to clinically diagnose ongoing COVID-19 infection.
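The thresholding step named in the abstract can be illustrated with a minimal sketch of Kapur's entropy criterion. This is not the authors' implementation. It assumes a single (bi-level) threshold on a grey-level histogram and replaces the social group optimization search with a plain exhaustive search, which is feasible for 256 grey levels. The function name `kapur_threshold` and the synthetic histogram are hypothetical and chosen only for illustration.

```python
import math


def kapur_threshold(hist):
    """Return the grey level t that maximizes Kapur's entropy criterion.

    The histogram is split into two classes, [0, t] and [t+1, L-1], and the
    sum of the Shannon entropies of the two normalized class distributions
    is maximized. Assumption: exhaustive search stands in for the
    optimization-based search described in the abstract.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    p = [h / total for h in hist]  # grey-level probabilities

    best_t, best_entropy = None, -math.inf
    for t in range(len(p) - 1):
        w0 = sum(p[: t + 1])   # probability mass of the lower class
        w1 = sum(p[t + 1:])    # probability mass of the upper class
        if w0 <= 0.0 or w1 <= 0.0:
            continue           # one class is empty, entropy is undefined
        h0 = -sum((q / w0) * math.log(q / w0) for q in p[: t + 1] if q > 0)
        h1 = -sum((q / w1) * math.log(q / w1) for q in p[t + 1:] if q > 0)
        if h0 + h1 > best_entropy:
            best_t, best_entropy = t, h0 + h1
    return best_t


# Synthetic bimodal histogram (hypothetical data): a dark cluster at grey
# levels 10-19 and a bright cluster at grey levels 200-209.
hist = [0] * 256
for g in range(10, 20):
    hist[g] = 100
for g in range(200, 210):
    hist[g] = 100

t = kapur_threshold(hist)
# Pixels with grey level > t would form the candidate (bright) region.
print("selected threshold:", t)
```

For this synthetic histogram, any threshold that falls between the two clusters separates them cleanly, so the selected value lies in that gap. A multi-level variant would search over several thresholds at once, which is where a heuristic optimizer becomes useful.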