Department of Information Technology, Jadavpur University, Jadavpur University Second Campus, Plot No. 8, Salt Lake Bypass, LB Block, Sector III, Salt Lake City, Kolkata, 700106, West Bengal, India.
Department of Mechanical Engineering, Faculty of Engineering and Information Technology, The University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia.
Sci Rep. 2023 Jun 19;13(1):9937. doi: 10.1038/s41598-023-36921-8.
Colorectal cancer is the third most common type of cancer diagnosed annually and the second leading cause of cancer death. Early diagnosis is vital for preventing tumours from spreading and for planning treatment that may eradicate the disease. However, population-wide screening is hindered by the need for medical professionals to analyse histological slides manually. This research therefore proposes an automated computer-aided detection (CAD) framework based on deep learning that makes predictions from histological slide images. Ensemble learning is a popular strategy for fusing the salient properties of several models to make the final predictions. However, such frameworks are computationally costly since they require the training of multiple base learners. In this study, we instead adopt a snapshot ensemble method in which, rather than fusing decision scores from the snapshots of a Convolutional Neural Network (CNN) model as is traditionally done, we extract deep features from the penultimate layer of the CNN model. Since the deep features are extracted from the same CNN model but under different learning environments, the feature set may contain redundancy. To alleviate this, the features are fed into Particle Swarm Optimization, a popular meta-heuristic, to reduce the dimensionality of the feature space and improve classification. Evaluated on a publicly available colorectal cancer histology dataset using a five-fold cross-validation scheme, the proposed method achieves a best accuracy of 97.60% and an F1-score of 97.61%, outperforming existing state-of-the-art methods on the same dataset. Further, a qualitative investigation of class activation maps provides visual explainability to medical practitioners and justifies the use of the CAD framework in the screening of colorectal histology. Our source code is publicly accessible at: https://github.com/soumitri2001/SnapEnsemFS
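The abstract does not state how the snapshots of the CNN are produced. As a minimal sketch, the code below assumes the cyclic cosine-annealed learning rate commonly used with snapshot ensembles, where the rate restarts at the beginning of each cycle and a snapshot is saved at the end of each cycle. The function names `snapshot_lr` and `snapshot_steps` and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def snapshot_lr(step, total_steps, n_cycles, base_lr):
    """Cyclic cosine-annealed learning rate for a 0-indexed training step.

    The rate starts at base_lr at the beginning of every cycle and decays
    towards zero by the end of the cycle (assumed schedule, see lead-in).
    """
    cycle_len = math.ceil(total_steps / n_cycles)
    pos = step % cycle_len  # position inside the current cycle
    return 0.5 * base_lr * (math.cos(math.pi * pos / cycle_len) + 1.0)


def snapshot_steps(total_steps, n_cycles):
    """Steps (0-indexed) after which a snapshot would be saved: the last step of each cycle."""
    cycle_len = math.ceil(total_steps / n_cycles)
    return [min((c + 1) * cycle_len, total_steps) - 1 for c in range(n_cycles)]


if __name__ == "__main__":
    # Hypothetical setup: 100 training steps split into 5 cycles.
    for s in snapshot_steps(100, 5):
        print(f"snapshot after step {s}, lr = {snapshot_lr(s, 100, 5, 0.1):.5f}")
```

In a setup like this, the penultimate-layer features of each saved snapshot would be concatenated per image before feature selection, which is why the combined feature set can be redundant.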
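The feature selection step can be illustrated with a generic binary Particle Swarm Optimization sketch. This is not the authors' implementation: the sigmoid transfer function, the fitness weighting `alpha`, the nearest-centroid stand-in classifier, and every name and hyperparameter below are assumptions chosen to keep the example self-contained. Each particle is a 0/1 mask over the feature columns, and fitness rewards validation accuracy plus a small bonus for discarding features.

```python
import numpy as np


def centroid_accuracy(X_tr, y_tr, X_va, y_va):
    """Validation accuracy of a nearest-centroid classifier (stand-in classifier)."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    dists = ((X_va[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_va).mean())


def fitness(mask, X_tr, y_tr, X_va, y_va, alpha=0.99):
    """Weighted sum of accuracy and the fraction of features discarded."""
    if not mask.any():
        return 0.0  # an empty feature subset is useless
    acc = centroid_accuracy(X_tr[:, mask], y_tr, X_va[:, mask], y_va)
    return alpha * acc + (1.0 - alpha) * (1.0 - mask.mean())


def binary_pso(X_tr, y_tr, X_va, y_va, n_particles=20, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Return (boolean feature mask, fitness) of the best particle found."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]

    def evaluate(P):
        return np.array([fitness(p.astype(bool), X_tr, y_tr, X_va, y_va) for p in P])

    pos = (rng.random((n_particles, d)) < 0.5).astype(float)  # 0/1 positions
    vel = rng.uniform(-1.0, 1.0, (n_particles, d))
    pbest, pbest_fit = pos.copy(), evaluate(pos)
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -6.0, 6.0)
        prob = 1.0 / (1.0 + np.exp(-vel))  # sigmoid transfer function
        pos = (rng.random((n_particles, d)) < prob).astype(float)
        fit = evaluate(pos)
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), float(gbest_fit)


if __name__ == "__main__":
    # Synthetic stand-in for concatenated snapshot features: 5 informative columns, 35 noise.
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 100)
    X = rng.normal(size=(200, 40))
    X[:, :5] += 3.0 * y[:, None]
    idx = rng.permutation(200)
    tr, va = idx[:140], idx[140:]
    mask, fit = binary_pso(X[tr], y[tr], X[va], y[va])
    print(f"kept {mask.sum()} of {mask.size} features, fitness {fit:.4f}")
```

In practice the fitness would be computed with the actual downstream classifier inside each cross-validation fold, so that the selected features are never tuned on the fold used for testing.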