College of Computer Engineering and Sciences, Prince Sattam Bin AbdulAziz University, Al Kharj, Saudi Arabia.
Neural Netw. 2023 May;162:240-257. doi: 10.1016/j.neunet.2023.02.035. Epub 2023 Feb 27.
Breast cancer is common among women and can be fatal when left untreated. Early detection is vital so that suitable treatment can prevent the cancer from spreading further and save lives. Traditional detection is a time-consuming process. With the evolution of DM (Data Mining), the healthcare industry can benefit in predicting the disease, as DM allows physicians to determine the significant attributes for diagnosis. Although conventional techniques have used DM-based methods to identify breast cancer, they fall short in prediction rate. Moreover, parametric softmax classifiers have been the usual choice in conventional works with a fixed set of classes, particularly when large amounts of labelled data are available during training. This becomes a problem in open-set cases, where new classes are encountered with only a few instances from which to learn a generalized parametric classifier. Thus, the present study aims to implement a non-parametric strategy by optimizing the feature embedding rather than a parametric classifier. This research uses a Deep CNN (Deep Convolutional Neural Network) and Inception V3 to learn visual features that preserve the neighbourhood structure in semantic space according to the NCA (Neighbourhood Component Analysis) criterion. Constrained by its bottleneck, the study proposes MS-NCA (Modified Scalable-Neighbourhood Component Analysis), which relies on a non-linear objective function to perform feature fusion by optimizing the distance-learning objective. As a result, it can compute inner products of features without performing an explicit mapping, which increases the scalability of MS-NCA. Finally, G-HPO (Genetic-Hyper-parameter Optimization) is proposed.
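The NCA criterion named above can be illustrated with a minimal sketch. It shows the standard NCA objective only, not the MS-NCA variant proposed in the study: each point picks a neighbour with probability given by a softmax over negative squared distances in the embedding space, and the objective is the mean probability of picking a neighbour of the same class. The function name `nca_objective` and the `temperature` parameter are assumptions made for this illustration.

```python
import numpy as np


def nca_objective(embeddings, labels, temperature=1.0):
    """Mean leave-one-out probability of selecting a same-class neighbour.

    Illustrative only: standard NCA on fixed embeddings, not MS-NCA.
    """
    x = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)

    # Pairwise squared Euclidean distances between embedded points.
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Softmax over negative distances; a point never selects itself.
    logits = -sq_dist / temperature
    np.fill_diagonal(logits, -np.inf)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    p = weights / weights.sum(axis=1, keepdims=True)

    # Probability mass that each point assigns to its own class.
    same_class = y[:, None] == y[None, :]
    p_correct = np.sum(p * same_class, axis=1)
    return float(p_correct.mean())
```

An embedding that keeps same-class points close together scores near 1, while one that mixes the classes scores near 0. In a non-parametric setting of this kind, the embedding network is trained to raise this score, and no per-class weight vectors are learned.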
In G-HPO, the new stage in the algorithm simply denotes an increase in the length of the chromosome, which brings several hyperparameters into the subsequent XGBoost, NB and RF models, which have numerous layers, for identifying normal and affected cases of breast cancer. Optimized hyper-parameter values are thereby determined for RF (Random Forest), NB (Naïve Bayes), and XGBoost (eXtreme Gradient Boosting). This process helps improve the classification rate, which is confirmed through analytical results.
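The idea of encoding hyperparameters on a chromosome can be sketched as a generic genetic search. This is a minimal illustration under stated assumptions, not the G-HPO algorithm of the study: the selection scheme (truncation with elitism), uniform crossover, the search-space values, and the names `genetic_search`, `toy_fitness`, and `SPACE` are all hypothetical. In practice the fitness would be something like the cross-validated accuracy of an RF, NB, or XGBoost model; a toy function stands in for it here so the block runs with the standard library alone.

```python
import random


def genetic_search(space, fitness, pop_size=20, generations=15,
                   mutation_rate=0.2, seed=0):
    """Evolve hyperparameter settings; one gene per hyperparameter.

    Adding a hyperparameter to `space` lengthens the chromosome by one gene.
    """
    rng = random.Random(seed)
    names = list(space)

    def decode(chromosome):
        return dict(zip(names, chromosome))

    def score(chromosome):
        return fitness(decode(chromosome))

    population = [[rng.choice(space[n]) for n in names]
                  for _ in range(pop_size)]

    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:max(2, pop_size // 2)]  # truncation selection
        children = [ranked[0]]                    # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally resample a gene from its allowed values.
            for i, name in enumerate(names):
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(space[name])
            children.append(child)
        population = children

    best = max(population, key=score)
    return decode(best), score(best)


# Hypothetical search space; the values are illustrative, not from the study.
SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [2, 4, 8, 16],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
}


def toy_fitness(params):
    """Stand-in for a cross-validated score; peaks at 200 / 8 / 0.1."""
    return -(abs(params["n_estimators"] - 200) / 100
             + abs(params["max_depth"] - 8) / 4
             + abs(params["learning_rate"] - 0.1) * 10)
```

Calling `genetic_search(SPACE, toy_fitness)` returns the best setting found and its score. Replacing `toy_fitness` with a function that trains and validates a classifier turns the sketch into a hyperparameter search for that model.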