Department of Computer Science, Faculty of Business Administration and Information Technology, Rajamangala University of Technology Tawan-Ok, Thailand.
Asian Pac J Cancer Prev. 2021 Dec 1;22(12):4069-4074. doi: 10.31557/APJCP.2021.22.12.4069.
OBJECTIVE: Breast cancer patients who are diagnosed early have a better prognosis than those diagnosed late. The most common screening methods are mammography and ultrasound. In recent years, researchers have tried to develop data-driven models that predict cancer staging from the first screening. However, some data elements, such as lymph node status, are incomplete, which makes an integrated-dataset approach challenging.

METHODS: Because the data elements were not collected from the same source, mammography and biopsy data were joined using latent variables determined by tumor severity. The datasets consisted of 445 mammography reports and 183 pathological reports. The latent variables of the mammography dataset were determined by the severity of the mass, while those of the pathological dataset were determined by TNM staging. These latent variables were used to join the two datasets. Prediction models were then built using machine learning techniques. The modeling was divided into three steps: staging prediction, lymph node prediction, and prognosis.

RESULTS: The dataset integrated from mammography and biopsy provided additional factors and was used to build models that predict breast cancer staging during the mammography process. The staging prediction achieved 100% accuracy. The lymph node prediction achieved 72.47% accuracy, 73.94% specificity, and 72.5% sensitivity, with an area under the ROC curve of 0.74. The prognosis model achieved 72.72% accuracy, 80% specificity, and 77% sensitivity, with an area under the ROC curve of 0.87. Rules for early staging, diagnosis, and prognosis were also derived.

CONCLUSION: This study aimed to build models for early staging, diagnosis, and prognosis using a less aggressive method. The advantages are that the models can (1) predict staging from the first screening, (2) estimate lymph node metastasis to support planning of axillary lymph node dissection (ALND) or sentinel lymph node biopsy (SLNB), and (3) evaluate overall survival time. These advantages help physicians plan the best treatment for cancer patients.
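The join on a shared latent severity variable and the reported evaluation metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how severity levels are defined or how records are paired, so the field names (`mass_severity`, `tnm_stage`), the three-level scale, the mapping rules, and the many-to-many pairing are all assumptions made for illustration. The sample records and labels are invented and do not come from the study.

```python
# Assumed three-level severity scale; the abstract does not define the levels.
MASS_TO_LEVEL = {"benign": 0, "suspicious": 1, "malignant": 2}
TNM_TO_LEVEL = {"0": 0, "I": 1, "II": 1, "III": 2, "IV": 2}


def join_on_latent(mammo_rows, patho_rows):
    """Pair every mammography record with every pathology record
    that maps to the same (assumed) latent severity level."""
    by_level = {}
    for p in patho_rows:
        level = TNM_TO_LEVEL[p["tnm_stage"]]
        by_level.setdefault(level, []).append(p)

    joined = []
    for m in mammo_rows:
        level = MASS_TO_LEVEL[m["mass_severity"]]
        for p in by_level.get(level, []):
            joined.append({**m, **p, "latent": level})
    return joined


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Invented example records, for illustration only.
    mammo = [
        {"mammo_id": "m1", "mass_severity": "benign"},
        {"mammo_id": "m2", "mass_severity": "malignant"},
    ]
    patho = [
        {"patho_id": "p1", "tnm_stage": "III"},
        {"patho_id": "p2", "tnm_stage": "IV"},
        {"patho_id": "p3", "tnm_stage": "I"},
    ]
    for row in join_on_latent(mammo, patho):
        print(row)
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Sensitivity and specificity are computed here in the standard way, as the true positive rate and true negative rate; for the lymph node model, a positive label would correspond to nodal metastasis under this sketch's assumptions.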