School of Industrial and Systems Engineering, Tarbiat Modares University (TMU), 14117-13114, Tehran, Iran.
Computational Analysis and Modeling, Louisiana Tech University, Ruston, LA, USA.
Phys Eng Sci Med. 2021 Mar;44(1):291-311. doi: 10.1007/s13246-021-00980-w. Epub 2021 Feb 22.
Tuberculosis (TB) is an infectious disease caused by the bacterium Mycobacterium tuberculosis. In 2018, about 10 million people were diagnosed with TB worldwide. Early diagnosis of TB is necessary for effective treatment, a higher survival rate, and prevention of further transmission. The gold standard for TB diagnosis is sputum culture. Nevertheless, posteroanterior chest radiography (CXR) is an effective and central screening method for TB, with low cost, a relatively low radiation dose, and immediate results. Diagnosing TB from CXR requires a high level of expertise because the disease presents in diverse ways, and significant intra-class variation and inter-class similarity in CXR images make the task more challenging still. The main aim of this study is to recognize TB from CXR images in order to reduce the disease burden. For this purpose, a novel multi-instance classification model based on CNNs, complex networks, and a stacked ensemble (CCNSE) is proposed. A main advantage of CCNSE is that it does not require accurate lung segmentation to localize suspicious regions. Several overlapping patches are extracted from each CXR image. Features describing each patch are obtained by CNNs, and the resulting feature vectors are clustered. Local complex networks (LCNs) and global complex networks (GCNs) of the cluster representatives are formed. Feature engineering on the LCNs generates additional image-level features, while feature engineering on the GCNs generates additional patch-level and image-level features. Global clustering is then performed on these feature sets for all patches, and each patch is assigned the purity score of its corresponding cluster. Patch-level features and purity scores are aggregated for each image. Finally, the images are classified into normal and TB classes with a proposed stacked ensemble classifier. Two datasets are used in this study: the Montgomery County CXR set (MC) and the Shenzhen dataset (SZ).
MC contains 138 CXR images (80 normal, 58 TB) and SZ contains 662 CXR images (326 normal, 336 TB). With fivefold cross-validation, the proposed method achieves an AUC of 99.00 ± 0.28 on MC and 98.00 ± 0.16 on SZ, and an accuracy of 99.26 ± 0.40 on MC and 99.22 ± 0.32 on SZ, outperforming the compared methods for diagnosing TB from CXR images. The proposed method can be used as a computer-aided diagnosis system to reduce manual time and effort and dependence on the specialist's level of expertise.
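Three of the steps named in the abstract (overlapping patch extraction, per-cluster purity scoring, and per-image aggregation) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the patch size and stride, the definition of purity as the majority-class fraction within a cluster, and the use of a simple mean for aggregation are all assumptions made for illustration, since the abstract does not specify them. Each patch is assumed to inherit the label of its source image, and cluster assignments are taken as given rather than computed from CNN features.

```python
from collections import Counter, defaultdict


def extract_patches(image, size, stride):
    """Return (row, col, patch) for every size x size window.

    A stride smaller than size makes neighbouring patches overlap.
    `image` is a 2D list of pixel values (hypothetical toy input).
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patch = [row[c:c + size] for row in image[r:r + size]]
            patches.append((r, c, patch))
    return patches


def cluster_purity(cluster_ids, labels):
    """Map each cluster id to the fraction of its patches in the majority class.

    Assumed definition of purity; the abstract does not give a formula.
    """
    members = defaultdict(list)
    for cid, label in zip(cluster_ids, labels):
        members[cid].append(label)
    return {
        cid: Counter(labs).most_common(1)[0][1] / len(labs)
        for cid, labs in members.items()
    }


def image_purity(image_ids, cluster_ids, purity):
    """Average the purity scores of the patches belonging to each image."""
    scores = defaultdict(list)
    for img, cid in zip(image_ids, cluster_ids):
        scores[img].append(purity[cid])
    return {img: sum(vals) / len(vals) for img, vals in scores.items()}


# Toy usage: a 4 x 4 "image" and five labelled patches from two images.
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
toy_patches = extract_patches(toy_image, size=2, stride=1)

patch_image_ids = ["a", "a", "b", "b", "b"]
patch_labels = ["tb", "tb", "normal", "normal", "normal"]  # inherited from images
patch_clusters = [0, 0, 0, 1, 1]  # stand-in for a real clustering result

purity = cluster_purity(patch_clusters, patch_labels)
per_image = image_purity(patch_image_ids, patch_clusters, purity)
```

In a cross-validated setting such as the fivefold protocol reported above, purity would presumably have to be computed from training-fold labels only, so that test labels do not leak into the features. The per-image scores would then be concatenated with the other aggregated features before classification.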