Machine learning models predicts risk of proliferative lupus nephritis.

Affiliations

Department of Laboratory Medicine, West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China.

Jintang First People's Hospital, Chengdu, China.

Publication information

Front Immunol. 2024 Jun 11;15:1413569. doi: 10.3389/fimmu.2024.1413569. eCollection 2024.

DOI:10.3389/fimmu.2024.1413569

PMID:38919623

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11196753/

Abstract

OBJECTIVE

This study aims to develop and validate machine learning models to predict proliferative lupus nephritis (PLN) occurrence, offering a reliable diagnostic alternative when renal biopsy is not feasible or safe.

METHODS

This study retrospectively analyzed clinical and laboratory data from patients diagnosed with systemic lupus erythematosus (SLE) and renal involvement who underwent renal biopsy at West China Hospital of Sichuan University between 2011 and 2021. We randomly assigned 70% of the patients to a training cohort and the remaining 30% to a test cohort. Several machine learning models were constructed on the training cohort, including generalized linear models (e.g., logistic regression, least absolute shrinkage and selection operator (LASSO), ridge regression, and elastic net), support vector machines (linear and radial basis kernel functions), and decision tree models (e.g., classical decision tree, conditional inference tree, and random forest). Diagnostic performance was evaluated in both cohorts using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). In addition, the models were compared to identify key and shared features, with the aim of screening for potential PLN diagnostic markers.
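
Two of the evaluation measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `auc_rank` and `net_benefit` and the eight toy cases are assumptions made for illustration only. The area under the ROC curve is computed as the share of positive/negative pairs in which the positive case receives the higher score, and the net benefit used in decision curve analysis follows the standard definition TP/n - FP/n × pt/(1 - pt) at threshold probability pt.

```python
def auc_rank(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit at threshold probability pt: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # A case is called positive when its predicted probability reaches the threshold.
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy labels (1 = PLN, 0 = NPLN) and predicted probabilities; NOT study data.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.1]

auc = auc_rank(y_true, y_score)
nb_model = net_benefit(y_true, y_score, 0.5)
# "Treat all" reference strategy: every case is called positive.
nb_treat_all = net_benefit(y_true, [1.0] * len(y_true), 0.5)

print(f"AUC = {auc:.4f}")
print(f"net benefit at pt = 0.5: model = {nb_model:.3f}, treat all = {nb_treat_all:.3f}")
```

A decision curve repeats the net-benefit calculation over a range of thresholds and plots the model against the "treat all" and "treat none" (net benefit 0) strategies.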

RESULTS

Of 1312 patients with lupus nephritis (LN), 780 with PLN or non-proliferative LN (NPLN) were included in the analysis and randomly divided into a training group (547 cases) and a testing group (233 cases). We developed nine machine learning models in the training group. Seven models demonstrated excellent discriminatory ability in the testing cohort, and the random forest model showed the highest (AUC: 0.880; 95% confidence interval (CI): 0.835-0.926). Logistic regression had the best calibration, while random forest exhibited the greatest clinical net benefit. Comparing features across models confirmed the value of traditional indicators such as anti-dsDNA antibodies, complement levels, serum creatinine, and urinary red and white blood cells in predicting and distinguishing PLN. It also pointed to the potential value of previously controversial or underutilized indicators, including serum chloride, neutrophil percentage, serum cystatin C, hematocrit, urinary pH, red blood cell count from the routine blood test, and immunoglobulin M.
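
The abstract reports a 95% CI for the AUC but does not state how it was derived. As a hedged illustration, the sketch below uses a percentile bootstrap, which is one common choice and is an assumption here, not the authors' stated method. The helper names, the simulated scores, and the sample size of 60 are all hypothetical and unrelated to the study data.

```python
import random


def auc_rank(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, for illustration)."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample contains a single class
        stats.append(auc_rank(ys, [y_score[i] for i in idx]))
    stats.sort()
    lo_idx = int(round(alpha / 2 * n_boot))
    hi_idx = int(round((1 - alpha / 2) * n_boot)) - 1
    return stats[lo_idx], stats[hi_idx]


# Simulated test set: 30 positives and 30 negatives with overlapping scores.
rng = random.Random(42)
y_true = [1] * 30 + [0] * 30
y_score = [rng.gauss(0.65, 0.15) for _ in range(30)] + [rng.gauss(0.40, 0.15) for _ in range(30)]

auc = auc_rank(y_true, y_score)
ci_low, ci_high = bootstrap_auc_ci(y_true, y_score)
print(f"AUC = {auc:.3f}, 95% CI: {ci_low:.3f}-{ci_high:.3f}")
```

With a fixed seed the interval is reproducible. Analytic alternatives such as the DeLong method are also widely used for AUC intervals.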

CONCLUSION

This study provides a comprehensive perspective on incorporating a broader range of biomarkers for diagnosing and predicting PLN. Additionally, it offers an ideal non-invasive diagnostic tool for SLE patients unable to undergo renal biopsy.
