Sarvestany Soren Sabet, Kwong Jeffrey C, Azhie Amirhossein, Dong Victor, Cerocchi Orlando, Ali Ahmed Fuad, Karnam Ravikiran S, Kuriry Hadi, Shengir Mohamed, Candido Elisa, Duchen Raquel, Sebastiani Giada, Patel Keyur, Goldenberg Anna, Bhat Mamatha
Department of Computer Science, University of Toronto, Toronto, ON, Canada; ICES, Toronto, ON, Canada; SickKids Research Institute, Toronto, ON, Canada.
Department of Family and Community Medicine, University of Toronto, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; ICES, Toronto, ON, Canada; Department of Health Protection, Public Health Ontario, ON, Canada.
Lancet Digit Health. 2022 Mar;4(3):e188-e199. doi: 10.1016/S2589-7500(21)00270-3.
Cirrhosis is the result of advanced scarring (or fibrosis) of the liver, and is often diagnosed only once decompensation with associated complications has occurred. Current non-invasive tests to detect advanced liver fibrosis have limited performance, with many indeterminate classifications. We aimed to identify patients with advanced liver fibrosis of all causes using machine learning algorithms (MLAs).
In this retrospective study of routinely collected laboratory, clinical, and demographic data, we trained six MLAs (support vector machine, random forest classifier, gradient boosting classifier, logistic regression, artificial neural network, and an ensemble of all these algorithms) to detect advanced fibrosis using 1703 liver biopsies from patients seen at the Toronto Liver Clinic (TLC) between Jan 1, 2000, and Dec 20, 2014. Performance was validated using five datasets derived from patient data provided by the TLC (n=104 patients with a biopsy sample taken between March 24, 2014, and Dec 31, 2017) and McGill University Health Centre (MUHC; n=404). Patients with decompensated cirrhosis were excluded. Performance was benchmarked against aspartate aminotransferase-to-platelet ratio index (APRI), fibrosis-4 index (FIB-4), non-alcoholic fatty liver disease fibrosis score (NFS), transient elastography, and an independent panel of five hepatology experts (MB, GS, HK, KP, and RSK). MLA performance was evaluated using the area under the receiver operating characteristic curve (AUROC) and the percentage of determinate classifications.
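To make the "percentage of determinate classifications" metric concrete, the following minimal Python sketch computes APRI and FIB-4 with their standard published formulas and applies a two-threshold rule, where any score falling between the thresholds is indeterminate. The cutoffs (0.5 and 1.5 for APRI, 1.45 and 3.25 for FIB-4) are commonly cited values and are an assumption here, since the abstract does not state which cutoffs the study used. The function names and the four example patients are hypothetical.

```python
import math

# Commonly cited rule-out / rule-in cutoffs. These are ASSUMPTIONS for
# illustration; the abstract does not state the cutoffs used in the study.
APRI_CUTOFFS = (0.5, 1.5)
FIB4_CUTOFFS = (1.45, 3.25)


def apri(ast, ast_upper_limit, platelets):
    """AST-to-platelet ratio index (AST in U/L, platelets in 10^9/L)."""
    return (ast / ast_upper_limit) / platelets * 100


def fib4(age, ast, alt, platelets):
    """Fibrosis-4 index (age in years, AST/ALT in U/L, platelets in 10^9/L)."""
    return (age * ast) / (platelets * math.sqrt(alt))


def classify(score, cutoffs):
    """Two-threshold rule: scores between the cutoffs are indeterminate."""
    low, high = cutoffs
    if score < low:
        return "advanced fibrosis unlikely"
    if score > high:
        return "advanced fibrosis likely"
    return "indeterminate"


def percent_determinate(labels):
    """Share of patients who received a definite (non-indeterminate) call."""
    determinate = sum(1 for label in labels if label != "indeterminate")
    return 100.0 * determinate / len(labels)


# Hypothetical patients: (age, AST, ALT, platelets), with AST upper limit 40 U/L.
PATIENTS = [(35, 25, 30, 250), (60, 120, 64, 90), (55, 60, 49, 150), (50, 48, 36, 200)]

if __name__ == "__main__":
    fib4_labels = [classify(fib4(*p), FIB4_CUTOFFS) for p in PATIENTS]
    print(fib4_labels, percent_determinate(fib4_labels))
```

A classifier that outputs a probability and applies a single decision threshold has no indeterminate zone, which is consistent with the 100% determinate classifications reported for the ensemble.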
The best MLA was an ensemble of the support vector machine, random forest classifier, gradient boosting classifier, logistic regression, and neural network algorithms. It achieved 100% determinate classifications (95% CI 100·0-100·0), an AUROC of 0·870 (95% CI 0·797-0·931) on the TLC validation set (fibrosis stages F0 and F1 vs F4), and an AUROC of 0·716 (95% CI 0·664-0·766) on the MUHC validation set (fibrosis stages F0, F1, and F2 vs F3 and F4). The ensemble MLA outperformed all routinely used biomarkers and performed comparably to hepatologists, as measured by AUROC and percentage of indeterminate classifications, in both the TLC validation dataset (APRI AUROC 0·719 [95% CI 0·611-0·820], 83·7% determinate [95% CI 76·0-90·4]; FIB-4 AUROC 0·825 [95% CI 0·730-0·912], 72·1% determinate [95% CI 63·5-80·8]) and the MUHC validation dataset (APRI AUROC 0·618 [95% CI 0·548-0·691], 75·5% determinate [95% CI 71·5-79·2]; FIB-4 AUROC 0·717 [95% CI 0·652-0·776], 75·5% determinate [95% CI 71·3-79·7]). Its AUROC was only slightly lower than that of transient elastography (0·773 [95% CI 0·699-0·834] vs 0·826 [95% CI 0·758-0·889]).
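As an illustration of how ensemble predictions can be scored with an AUROC and a 95% CI, here is a minimal Python sketch. Two things in it are assumptions, because the abstract does not describe them: the ensemble is taken to be an unweighted average of the base models' predicted probabilities (soft voting), and the interval is a percentile bootstrap. All function names are hypothetical.

```python
import random


def soft_vote(probabilities_by_model):
    """Average each patient's predicted probability across models.

    ASSUMPTION: unweighted soft voting; the abstract does not say how the
    ensemble combined its base models.
    """
    n_models = len(probabilities_by_model)
    return [sum(column) / n_models for column in zip(*probabilities_by_model)]


def auroc(labels, scores):
    """Probability that a random positive scores above a random negative.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (ASSUMED method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUROC is undefined there.
        if 0 < sum(resampled_labels) < n:
            stats.append(auroc(resampled_labels, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    y = [0, 0, 1, 1, 0, 1]                      # 1 = advanced fibrosis on biopsy
    model_a = [0.1, 0.4, 0.35, 0.8, 0.3, 0.7]   # hypothetical base-model outputs
    model_b = [0.2, 0.3, 0.55, 0.9, 0.2, 0.6]
    ensemble = soft_vote([model_a, model_b])
    print(auroc(y, ensemble), bootstrap_ci(y, ensemble))
```

With validation sets of only a few hundred patients, an interval of this kind shows how much of the difference between two AUROCs could be due to sampling alone.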
We have shown that an ensemble MLA outperforms non-imaging-based methods in detecting advanced fibrosis across different causes of liver disease. Our MLA was superior to APRI, FIB-4, and NFS, with no indeterminate classifications, while achieving performance comparable to that of an independent panel of experts. MLAs using routinely collected data could identify patients at high risk of advanced hepatic fibrosis and cirrhosis among patients with chronic liver disease, allowing intervention before the onset of decompensation.
Toronto General Hospital Foundation.