Shen Lujun, Zhang Tao, Xu Jian, Jiang Yiquan, Cao Fei, Chen Qifeng, Li Chen, Nuerhashi Gulijiayina, Li Wang, Wu Peihong, Fan Weijun
Department of Minimally Invasive Therapy, Sun Yat-sen University Cancer Center, Guangzhou 510060, P. R. China.
State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, P. R. China.
Bioinform Adv. 2025 Feb 17;5(1):vbaf027. doi: 10.1093/bioadv/vbaf027. eCollection 2025.
Patients with intermediate-stage hepatocellular carcinoma (HCC) require repeated disease monitoring, prognosis assessment, and treatment planning. A novel machine learning model, the survival path mapping (SP) model, has been developed, but its performance relative to conventional machine learning models remains unknown. Time-series data of 2644 intermediate-stage HCC patients from four medical centers in China, collected between January 2007 and December 2018, were reviewed and included. Static machine learning models for survival prediction were built with Gaussian Naive Bayes (GNB), support vector machine (SVM), and random forest (RF), using data from initial admission. Longitudinal data divided into time slices were used to construct the SP model. The time-dependent C-index was compared between models.
The training set, internal testing set, and external testing set comprised 1560, 670, and 414 HCC patients, respectively. From the 12th month after initial diagnosis onward, the SP model showed superior or non-inferior performance in prognosis prediction compared with the GNB and RF models in both the training set and the external testing set. From the 6th month onward, the SP model had a higher time-dependent C-index than all conventional machine learning models in the external testing cohort. In conclusion, the SP model outperformed conventional static machine learning models in long-term dynamic prognosis prediction for intermediate-stage HCC.
The model parameters are provided in the manuscript.
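The model comparison above rests on a concordance index evaluated at successive time points. The abstract does not give the authors' exact definition, so the following is only a minimal sketch of one common reading: Harrell's C-index computed over the patients still at risk after a chosen landmark time. The function name `concordance_index` and the `landmark` parameter are illustrative assumptions, not names from the study.

```python
from typing import Optional, Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
    landmark: Optional[float] = None,
) -> float:
    """Harrell's C-index for right-censored survival data.

    times:       follow-up time per patient (e.g. months)
    events:      1 if death was observed, 0 if censored
    risk_scores: model output; a higher score means higher predicted risk
    landmark:    hypothetical parameter (an assumption of this sketch);
                 if set, only patients with follow-up beyond it are scored
    """
    # Keep only patients still under observation after the landmark.
    idx = [i for i in range(len(times)) if landmark is None or times[i] > landmark]

    concordant = 0.0
    comparable = 0
    for i in idx:
        if not events[i]:
            # A censored patient cannot serve as the earlier-event member of a pair.
            continue
        for j in idx:
            # A pair is comparable when patient i died before patient j's last follow-up.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # the earlier death was ranked as higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half-concordant

    if comparable == 0:
        raise ValueError("no comparable pairs among the selected patients")
    return concordant / comparable


# Toy data, invented purely for illustration (not taken from the study).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.2, 0.4, 0.7]

c_overall = concordance_index(toy_times, toy_events, toy_scores)
c_after_6 = concordance_index(toy_times, toy_events, toy_scores, landmark=6)
```

Calling the function with increasing `landmark` values for each model's risk scores would give one curve per model, which is the kind of comparison the abstract reports. A static model would reuse its admission-time scores at every landmark, whereas a dynamic model could supply scores updated at each time slice.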