Ahamad Md Martuza, Aktar Sakifa, Uddin Md Jamal, Rahman Tasnia, Alyami Salem A, Al-Ashhab Samer, Akhdar Hanan Fawaz, Azad Akm, Moni Mohammad Ali
Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh.
Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi 6200, Bangladesh.
J Pers Med. 2022 Jul 25;12(8):1211. doi: 10.3390/jpm12081211.
Ovarian cancer is one of the most common cancers in women. At present, no drug therapy can reliably cure this deadly disease, but early-stage detection could extend patients' life expectancy. The main aim of this work is to apply machine learning models together with statistical methods to clinical data obtained from 349 patients in order to conduct predictive analytics for early diagnosis. In the statistical analysis, Student's t-test and the log fold change between the two groups are used to find significant blood biomarkers. In addition, a set of machine learning models, including Random Forest (RF), Support Vector Machine (SVM), Decision Tree (DT), Extreme Gradient Boosting Machine (XGBoost), Logistic Regression (LR), Gradient Boosting Machine (GBM) and Light Gradient Boosting Machine (LGBM), is used to build classification models that stratify patients into benign and malignant ovarian cancer groups. Both analysis techniques identified the serum markers carbohydrate antigen 125, carbohydrate antigen 19-9, carcinoembryonic antigen and human epididymis protein 4 as the most significant biomarkers, together with neutrophil ratio, thrombocytocrit and hematocrit from blood samples, and alanine aminotransferase, calcium, indirect bilirubin, uric acid and natrium (sodium) from general chemistry tests. Moreover, the results of the predictive analysis suggest that the machine learning models can distinguish malignant from benign patients with an accuracy of up to 91%. Since early-stage detection is generally not available, machine learning-based detection could play a significant role in cancer diagnosis.
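The statistical screening step described in the abstract (Student's t-test combined with a log fold change between the benign and malignant groups) can be illustrated with a minimal sketch. This is not the authors' code: the marker values are hypothetical toy numbers, the function names `student_t`, `log2_fold_change` and `screen` are illustrative, and the cutoffs (|t| above 2.306, the two-sided 5% critical value for 8 degrees of freedom, and an absolute log2 fold change of at least 1) are assumptions chosen for this toy sample size.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def log2_fold_change(a, b):
    """log2 of the ratio of group means; all values are assumed to be positive."""
    return math.log2(mean(a) / mean(b))


def screen(group_a, group_b, t_cut=2.306, lfc_cut=1.0):
    """Return markers whose |t| and |log2 fold change| both exceed the (assumed) cutoffs."""
    selected = []
    for marker in group_a:
        t = student_t(group_a[marker], group_b[marker])
        lfc = log2_fold_change(group_a[marker], group_b[marker])
        if abs(t) > t_cut and abs(lfc) >= lfc_cut:
            selected.append(marker)
    return selected


# Hypothetical toy values for two markers, NOT data from the study.
malignant = {
    "CA125": [310.0, 420.0, 150.0, 280.0, 510.0],
    "Ca": [2.30, 2.40, 2.20, 2.35, 2.25],
}
benign = {
    "CA125": [20.0, 35.0, 18.0, 40.0, 27.0],
    "Ca": [2.40, 2.30, 2.35, 2.20, 2.25],
}

significant = screen(malignant, benign)
print(significant)
```

The sketch compares the t statistic against a fixed critical value to stay within the standard library. With real data, a statistics package that reports p-values directly (for example `scipy.stats.ttest_ind`) would be the more usual choice, and the cutoffs would need to match the actual sample sizes.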