Development of a machine learning-based model to predict prognosis of alpha-fetoprotein-positive hepatocellular carcinoma.

Affiliations

Department of Ultrasound, the First Affiliated Hospital of Anhui Medical University, Hefei, China.

Department of Ultrasound, Affiliated Hangzhou First People's Hospital, School of Medicine, Westlake University, Hangzhou, China.

Publication information

J Transl Med. 2024 May 13;22(1):455. doi: 10.1186/s12967-024-05203-w.

Abstract

BACKGROUND

Alpha-fetoprotein (AFP)-positive hepatocellular carcinoma (HCC) behaves aggressively and carries a poor prognosis, so survival time is one of the greatest concerns for these patients. This study aimed to demonstrate the use of six machine learning (ML)-based prognostic models to predict overall survival in patients with AFP-positive HCC.

METHODS

Data on patients with AFP-positive HCC were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Six ML algorithms (extreme gradient boosting [XGBoost], logistic regression [LR], support vector machine [SVM], random forest [RF], K-nearest neighbor [KNN], and decision tree [ID3]) were used to develop prognostic models for patients with AFP-positive HCC at 1, 3, and 5 years. The models were evaluated using the area under the receiver operating characteristic curve (AUC), confusion matrices, calibration curves, and decision curve analysis (DCA).
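Two of the evaluation metrics named above, AUC and the confusion matrix, can be illustrated with a minimal sketch. It assumes a binary outcome at a fixed horizon (1 = died within the horizon, 0 = survived) and a 0.5 decision threshold, neither of which is specified in the abstract. The function names and the toy data are hypothetical, and this is not the authors' code.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_matrix(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), predicting 1 when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


# Toy data invented for illustration only (not from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

print(roc_auc(y_true, y_score))                      # 0.9375
print(confusion_matrix(y_true, y_score))             # (3, 1, 3, 1)
print(accuracy(*confusion_matrix(y_true, y_score)))  # 0.75
```

The pairwise AUC loop is quadratic in the number of cases; it is written this way to make the definition visible, not for efficiency.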

RESULTS

A total of 2,038 patients with AFP-positive HCC were included in the analysis. The 1-, 3-, and 5-year overall survival rates were 60.7%, 28.9%, and 14.3%, respectively. Seventeen demographic and clinicopathological features were entered into the six ML algorithms to generate the prognostic models. The XGBoost model showed the best performance in predicting survival at 1 year (train set: AUC = 0.771; test set: AUC = 0.782), 3 years (train set: AUC = 0.763; test set: AUC = 0.749), and 5 years (train set: AUC = 0.807; test set: AUC = 0.740). The accuracy of the XGBoost model in the training and test sets was 0.709 and 0.726 for 1-year survival, 0.721 and 0.726 for 3-year survival, and 0.778 and 0.784 for 5-year survival. Calibration curves and DCA also indicated good predictive performance.
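For readers unfamiliar with DCA, the quantity it plots is the net benefit of acting on a model's predictions at a chosen threshold probability. The sketch below uses the standard net-benefit definition (true positives per patient minus false positives per patient, weighted by the odds of the threshold). The function names and toy data are hypothetical, and the abstract does not report which thresholds the authors examined.

```python
from typing import Sequence


def net_benefit(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> float:
    """Net benefit of intervening on every case with score >= threshold.

    net benefit = TP/N - (FP/N) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: intervene on every case regardless of the model."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data invented for illustration only (not from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

print(net_benefit(y_true, y_score, 0.5))   # 0.25
print(net_benefit_treat_all(y_true, 0.5))  # 0.0
```

A decision curve repeats this calculation over a range of thresholds. A model is considered useful where its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is always zero.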

CONCLUSIONS

The XGBoost model exhibited good predictive performance and may provide physicians with an effective tool for early medical intervention, which may in turn improve patient survival.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c5d9/11092049/a986221e849c/12967_2024_5203_Fig3_HTML.jpg
