Galván-Tejada Carlos E, Zanella-Calzada Laura A, Galván-Tejada Jorge I, Celaya-Padilla José M, Gamboa-Rosales Hamurabi, Garza-Veloz Idalia, Martinez-Fierro Margarita L
Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, 98000 Zacatecas, Zac, Mexico.
Facultad de Ciencias, Universidad Autónoma de San Luis Potosí, Lateral Av. Salvador Nava s/n., 78290 San Luis Potosí, SLP, Mexico.
Diagnostics (Basel). 2017 Feb 14;7(1):9. doi: 10.3390/diagnostics7010009.
Breast cancer is an important global health problem and the most common type of cancer among women. Late diagnosis significantly decreases patient survival; early detection by mammography, however, has been shown to increase the survival rate. The purpose of this paper is to obtain a multivariate model that classifies tumor lesions as benign or malignant, using computer-assisted diagnosis with a genetic algorithm on training and test datasets of mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate the results. The multivariate models were constructed using Random Forest, Nearest Centroid, and K-Nearest Neighbor (K-NN) strategies as the cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that, according to their fitness values, the two texture descriptor features obtained in the multivariate model classify the data outcome as well as or better than the multivariate model composed of all the features. This model can help to reduce the workload of radiologists and provide a second opinion in the classification of tumor lesions.
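The general idea of using a classifier as the cost function of a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for the BCDR features, only a nearest-centroid classifier is shown, and all names (`make_data`, `nearest_centroid_accuracy`, `select_features`) and parameter values are hypothetical. Each individual is a Boolean mask over the features, and its fitness is the classification accuracy on a held-out split.

```python
import random


def make_data(n=120, n_features=8, seed=0):
    """Synthetic stand-in for lesion features (assumption: not BCDR data).
    Only features 0 and 1 carry class signal; the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 0 = benign, 1 = malignant (illustrative labels)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] -= 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_accuracy(X_tr, y_tr, X_va, y_va, mask):
    """Fitness: accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_va, y_va):
        dist = {lab: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
                for lab, cen in centroids.items()}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_va)


def select_features(X_tr, y_tr, X_va, y_va, pop_size=20, generations=15,
                    mutation_rate=0.1, seed=1):
    """Toy genetic algorithm over feature masks (hypothetical settings)."""
    rng = random.Random(seed)
    n = len(X_tr[0])

    def fitness(mask):
        return nearest_centroid_accuracy(X_tr, y_tr, X_va, y_va, mask)

    # Seed the population with the all-features mask so the result is
    # never worse than the full model on the held-out split.
    population = [[True] * n]
    population += [[rng.random() < 0.5 for _ in range(n)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    X, y = make_data()
    best, score = select_features(X[:80], y[:80], X[80:], y[80:])
    print("selected feature indices:", [j for j, g in enumerate(best) if g])
    print("held-out accuracy:", score)
```

Swapping `nearest_centroid_accuracy` for a K-NN or Random Forest scorer would change only the fitness function. In practice the fitness would be estimated by cross-validation on the training data, with a separate test set reserved for the final comparison against the all-features model.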