[1] Department of Internal Medicine, UT Southwestern Medical Center, Dallas, Texas, USA [2] Department of Clinical Sciences, University of Texas Southwestern, Dallas, Texas, USA [3] Harold C. Simmons Cancer Center, UT Southwestern Medical Center, Dallas, Texas, USA.
Am J Gastroenterol. 2013 Nov;108(11):1723-30. doi: 10.1038/ajg.2013.332. Epub 2013 Oct 29.
Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine-learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine-learning algorithms.
We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine-learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared with the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis, and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics.
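For readers unfamiliar with the three statistics named above, the following is a minimal sketch of how each can be computed from predicted risks. It is not the study's analysis code. The function names and the toy data are hypothetical, and the net reclassification improvement shown is the category-free (continuous) variant, which is an assumption because the abstract does not say which variant or risk categories were used.

```python
from itertools import product
from statistics import mean


def c_statistic(y, p):
    """Fraction of (event, non-event) pairs in which the event has the higher
    predicted risk; ties count as half. Equals the area under the ROC curve."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e, n in product(events, nonevents)
    )
    return score / (len(events) * len(nonevents))


def idi(y, p_old, p_new):
    """Integrated discrimination improvement: change in the gap between the
    mean predicted risk of events and that of non-events."""
    def discrimination_slope(p):
        mean_events = mean(pi for yi, pi in zip(y, p) if yi == 1)
        mean_nonevents = mean(pi for yi, pi in zip(y, p) if yi == 0)
        return mean_events - mean_nonevents
    return discrimination_slope(p_new) - discrimination_slope(p_old)


def continuous_nri(y, p_old, p_new):
    """Category-free NRI (assumed variant): net proportion of events whose
    risk moved up, minus the net proportion of non-events whose risk moved up."""
    def net_upward(label):
        # +1 if the new model raised the risk, -1 if it lowered it, 0 if unchanged
        return mean(
            (new > old) - (new < old)
            for yi, old, new in zip(y, p_old, p_new)
            if yi == label
        )
    return net_upward(1) - net_upward(0)


# Made-up illustration data, not taken from the study.
y = [1, 1, 0, 0, 0]                  # 1 = developed HCC, 0 = did not
p_old = [0.6, 0.3, 0.5, 0.2, 0.4]    # risks from a reference model
p_new = [0.7, 0.5, 0.4, 0.2, 0.3]    # risks from a candidate model

print("c-statistic (old):", c_statistic(y, p_old))
print("c-statistic (new):", c_statistic(y, p_new))
print("IDI:", idi(y, p_old, p_new))
print("continuous NRI:", continuous_nri(y, p_old, p_new))
```

A positive IDI or NRI indicates that the candidate model shifts predicted risk in the right direction more often than the reference model. Confidence intervals and P values, as reported in the abstract, would need an additional step such as bootstrapping, which this sketch omits.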
After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95% confidence interval (CI) 0.56-0.67), whereas the machine-learning algorithm had a c-statistic of 0.64 (95% CI 0.60-0.69) in the validation cohort. The HALT-C model had a c-statistic of 0.60 (95% CI 0.50-0.70) in the validation cohort and was outperformed by the machine-learning algorithm. The machine-learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (P<0.001) and integrated discrimination improvement (P=0.04).
Machine-learning algorithms improve the accuracy of risk stratification among patients with cirrhosis and can be used to accurately identify patients at high risk of developing HCC.