School of Information and Communication Engineering, Communication University of China, Beijing, China.
Department of Radiology, The Second Affiliated Hospital of Shenyang Medical College, Shenyang, China.
BMC Med Inform Decis Mak. 2023 Apr 6;23(1):64. doi: 10.1186/s12911-023-02142-2.
Breast cancer (BC) is one of the most common cancers among women. Because many diverse features can be collected, stably selecting the most informative ones for accurate BC diagnosis remains challenging.
A hybrid framework is designed to investigate, in sequence, feature ranking (FR) stability and cancer diagnosis effectiveness. Specifically, on four BC datasets (BCDR-F03, WDBC, GSE10810 and GSE15852), the stability of 23 FR algorithms is evaluated with an advanced estimator (S), and the predictive power of the stable feature ranks is then tested with different machine learning classifiers.
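The stability evaluation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the abstract does not name the estimator S, so the frequency-based stability estimator of Nogueira et al. is used as a stand-in; the classic two-class Fisher score is swapped in for the generalized Fisher score; and the toy data, the helper names (`fisher_scores`, `top_k`, `stability`, `bootstrap_stability`, `make_toy_data`) and the parameters (`k`, `n_rounds`) are hypothetical.

```python
import random
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Classic Fisher score per feature: between-class over within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            if not vals:  # class absent from this resample
                continue
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])


def stability(subsets, n_features):
    """Assumed stand-in for S: 1 minus the mean per-feature selection variance,
    normalized by the variance expected under random selection.
    Requires at least 2 subsets and 0 < average subset size < n_features."""
    m = len(subsets)
    k_bar = mean(len(s) for s in subsets)
    denom = (k_bar / n_features) * (1 - k_bar / n_features)
    total = 0.0
    for j in range(n_features):
        p = sum(j in s for s in subsets) / m  # selection frequency of feature j
        total += m / (m - 1) * p * (1 - p)    # unbiased sample variance
    return 1 - (total / n_features) / denom


def bootstrap_stability(X, y, k, n_rounds, rng):
    """Rank features on bootstrap resamples and measure top-k agreement."""
    n = len(X)
    subsets = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        subsets.append(top_k(fisher_scores(Xb, yb), k))
    return stability(subsets, len(X[0]))


def make_toy_data(rng, n=200, n_features=10, n_informative=3):
    """Hypothetical two-class data: the first n_informative features are shifted."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_toy_data(rng)
s_value = bootstrap_stability(X, y, k=3, n_rounds=30, rng=rng)
print(f"stability of the top-3 Fisher ranking: {s_value:.3f}")
```

In this sketch a value near 1 means the same features are selected on almost every resample, while values near 0 indicate agreement no better than random selection. A ranking judged stable in this way would then be passed to the classifiers to test its predictive power.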
Experimental results identify three algorithms that achieve good stability ([Formula: see text]) on all four datasets and show that generalized Fisher score (GFS) leads to state-of-the-art performance. Moreover, GFS ranks suggest that shape features are crucial in BC image analysis (BCDR-F03 and WDBC) and that a few genes suffice to differentiate benign from malignant tumor cases (GSE10810 and GSE15852).
The proposed framework identifies a stable FR algorithm for accurate BC diagnosis. Stable and effective features could deepen the understanding of BC diagnosis and support related decision-making applications.