Shen Duo, Sha Ling, Yang Ling, Gu Xuefeng
Department of Gastroenterology, The Second People's Hospital of Changzhou, the Third Affiliated Hospital of Nanjing Medical University, Changzhou, Jiangsu, China.
Department of Neurology, Nanjing Drum Tower Hospital, Affiliated to Nanjing University Medical School, Nanjing, Jiangsu, China.
BMC Infect Dis. 2025 Feb 1;25(1):151. doi: 10.1186/s12879-025-10566-6.
Hepatitis B-associated cirrhosis (HBC) is associated with severe complications and adverse clinical outcomes. This study aimed to develop and validate a predictive model for the occurrence of multiple complications (three or more) in patients with HBC and to explore the effects of multiple complications on HBC prognosis.
In this retrospective cohort study, data from 121 HBC patients treated at Nanjing Second Hospital from February 2009 to November 2019 were analysed. The maximum follow-up period was 10.75 years, with a median of 5.75 years. Eight machine learning techniques were used to construct predictive models from medical history, demographic, clinical sign, and laboratory test variables: C5.0, linear discriminant analysis (LDA), least absolute shrinkage and selection operator (LASSO), k-nearest neighbour (KNN), gradient boosting decision tree (GBDT), support vector machine (SVM), generalised linear model (GLM) and naive Bayes (NB). Model performance was evaluated via receiver operating characteristic (ROC) curve analysis, residual analysis, calibration curve analysis, and decision curve analysis (DCA). The influence of multiple complications on HBC survival time was assessed via Kaplan-Meier curve analysis. Furthermore, LASSO and univariable and multivariable Cox regression analyses were conducted to identify independent prognostic factors for overall survival (OS) in patients with HBC, and the resulting prognostic nomogram model was assessed by ROC, C-index, calibration curve, and DCA analyses. Bootstrap resampling was used for internal validation, and the Medical Information Mart for Intensive Care IV (MIMIC-IV) database was used for external validation.
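As a rough illustration of the ROC and bootstrap steps named above, the sketch below computes an AUC from predicted scores and a percentile bootstrap interval around it. It is a minimal example under stated assumptions, not the authors' pipeline: the abstract does not specify the bootstrap scheme, so this version only resamples fixed predictions (it does not refit any model), and the function names and toy data are hypothetical.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc(labels, scores, n_boot=500, seed=0):
    """Mean AUC and 95% percentile interval over bootstrap resamples.
    Resamples containing only one class are skipped and redrawn."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int(0.025 * n_boot)]
    upper = aucs[int(0.975 * n_boot) - 1]
    return sum(aucs) / n_boot, (lower, upper)

# Hypothetical toy data: 1 = three or more complications, 0 = fewer.
demo_labels = [0, 0, 1, 0, 1, 1, 0, 1]
demo_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90, 0.20, 0.60]

demo_auc = roc_auc(demo_labels, demo_scores)
demo_mean, (demo_lo, demo_hi) = bootstrap_auc(demo_labels, demo_scores)
print(demo_auc, demo_mean, demo_lo, demo_hi)
```

With so few cases the interval is wide, which is also why resampling-based validation matters for a cohort of this size.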
The GBDT model exhibited the highest area under the curve (AUC) and emerged as the optimal model for predicting the occurrence of multiple complications. The key predictive factors included posthospitalisation fever (PHF), body mass index (BMI), retinol binding protein (RBP), total bilirubin (TB) levels, and eosinophils (EOS). Kaplan-Meier analysis revealed that patients with multiple complications had significantly worse OS than those with fewer complications. Additionally, multivariable Cox regression analysis, informed by LASSO selection, identified hepatocellular carcinoma (HCC), multiple complications, and lactate dehydrogenase (LDH) levels as independent prognostic factors for OS. The prognostic model demonstrated 1-year, 3-year, and 5-year OS ROC AUCs of 0.802, 0.793, and 0.817, respectively. For the internal validation cohort, the corresponding AUC values were 0.797, 0.832, and 0.835. In contrast, the external validation cohort yielded a 1-year ROC AUC of 0.707. Calibration curves indicated good consistency of the model, and DCA demonstrated the model's clinical utility, showing high net benefits within certain threshold ranges. ROC analysis showed higher AUC values for the multivariable prognostic model than for the univariable models, and the multivariable model also had the highest C-index.
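For readers unfamiliar with the C-index reported above, the following sketch shows one common definition, Harrell's concordance index, computed over right-censored survival data. It is an assumption that this is the variant the authors used; the function name and the toy values are hypothetical, and pairs with tied survival times are simply skipped for brevity.

```python
def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    risks  : model risk score (higher = worse predicted survival)

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. It is concordant when
    patient i also has the higher risk score; tied scores count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy cohort: times in years, risk scores from some model.
demo_times = [2, 4, 5, 7, 9]
demo_events = [1, 1, 0, 1, 0]
demo_risks = [0.9, 0.4, 0.6, 0.5, 0.1]

demo_c = harrell_c_index(demo_times, demo_events, demo_risks)
print(demo_c)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time.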
The GBDT prediction model provides a reliable tool for the early identification of high-risk HBC patients prone to developing multiple complications. The concurrent occurrence of multiple complications is an independent prognostic factor for OS in patients with HBC. The constructed prognostic model demonstrated remarkable predictive performance and clinical applicability, indicating its crucial role in enhancing patient outcomes through timely and targeted interventions.