Wang Ying-Ying, Yang Wan-Xia, Du Qia-Jun, Liu Zhen-Hua, Lu Ming-Hua, You Chong-Ge
Laboratory Medicine Center, The Second Hospital & Clinical Medical School, Lanzhou University, Lanzhou 730030, Gansu Province, China.
World J Gastrointest Oncol. 2024 Sep 15;16(9):3839-3850. doi: 10.4251/wjgo.v16.i9.3839.
Liver cancer is one of the most prevalent malignant tumors worldwide, and early detection and treatment are crucial for improving patient survival and quality of life. However, its early symptoms are often subtle, so many patients are diagnosed at a late stage, when treatment is far less effective. A well-targeted, widely applicable, and practical risk prediction model for liver cancer could therefore improve early diagnosis and long-term survival among affected individuals.
To develop a liver cancer risk prediction model using machine learning techniques and to assess its performance.
A total of 550 patients were enrolled: 190 hepatocellular carcinoma (HCC) and 195 cirrhosis patients formed the training cohort, and 83 HCC and 82 cirrhosis patients formed the validation cohort. Logistic regression (LR), support vector machine (SVM), random forest (RF), and least absolute shrinkage and selection operator (LASSO) regression models were developed in the training cohort, and their performance was assessed in the validation cohort. The diagnostic efficacy of these models was also compared with that of the ASAP model using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) to determine the optimal predictive model for assessing liver cancer risk.
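As an illustration of the DCA step, the sketch below computes the standard net-benefit quantity at a single risk threshold, assuming binary labels (1 = HCC, 0 = cirrhosis) and model-predicted risks between 0 and 1. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(labels, risks, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold.

    Net benefit = TP/N - (FP/N) * threshold / (1 - threshold),
    where the odds of the threshold weight the harm of a false positive.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(labels, threshold):
    """Reference strategy for a decision curve: flag every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    weight = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * weight


# Toy data for illustration only (made-up values, not study data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_risks = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
nb_model = net_benefit(toy_labels, toy_risks, threshold=0.5)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.5)
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the "flag everyone" and "flag no one" (net benefit 0) strategies; a model is useful at a given threshold when its net benefit exceeds both.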
Six variables (age; white blood cell, red blood cell, and platelet counts; and alpha-fetoprotein and protein induced by vitamin K absence or antagonist II levels) were used to develop the LR, SVM, RF, and LASSO regression models. The RF model exhibited superior discrimination, with an area under the curve (AUC) of 0.969 in the training set and 0.858 in the validation set. These values significantly surpassed those of the LR (0.850 and 0.827), SVM (0.860 and 0.803), LASSO regression (0.845 and 0.831), and ASAP (0.866 and 0.813) models. Furthermore, calibration and DCA indicated that the RF model exhibited robust calibration and clinical validity.
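The AUC figures can be read as the probability that a randomly chosen HCC patient receives a higher predicted risk than a randomly chosen cirrhosis patient. A minimal sketch of that pairwise computation follows, assuming binary labels (1 = HCC, 0 = cirrhosis); the function name `roc_auc` and the toy scores are hypothetical and do not reproduce the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data for illustration only (made-up values, not study data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients; a rank-based formulation gives the same value more efficiently on larger data.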
The RF model demonstrated excellent prediction capabilities for HCC and can facilitate early diagnosis of HCC in clinical practice.