Machine learning-based prediction models for renal impairment in Chinese adults with hyperuricaemia: risk factor analysis.

Author information

Wu Tianchen, Yang Hui, Chen Jinbin, Kong Wenwen

Affiliations

Department of Neurology, Nanjing Hospital of Chinese Medicine Affiliated to Nanjing University of Chinese Medicine, Nanjing, China.

School of Nursing, Nanjing University of Chinese Medicine, Nanjing, China.

Publication information

Sci Rep. 2025 Mar 15;15(1):8968. doi: 10.1038/s41598-025-88632-x.

Abstract

In hyperuricaemic populations, multiple factors may contribute to impaired renal function. This study aimed to establish a machine learning-based model to identify characteristic factors related to renal impairment in hyperuricaemic patients, determine dose-response relationships, and facilitate early intervention strategies. Data were collected through the big data platform of Nanjing Hospital of Traditional Chinese Medicine, encompassing 2,705 patients with hyperuricaemia (1,577 with renal impairment, 828 without) from June 2019 to June 2022. After multiple imputation of missing values, the dataset was randomly split into training (70%) and validation (30%) sets. We employed three machine learning algorithms for feature selection: random forest (with 100 decision trees and an out-of-bag (OOB) error rate of 23.34%), LASSO regression (optimal lambda of -3.58), and XGBoost (learning rate of 0.3, maximum tree depth of 1, and 50 rounds of boosting). The intersection of the features identified by these algorithms, obtained through Venn diagram analysis, yielded four key predictors. A logistic regression model was subsequently constructed and evaluated for discrimination (AUC), calibration (Brier score), and clinical utility (decision curve analysis, DCA). Restricted cubic spline (RCS) curves were used to analyse the dose-response relationships. The model, which incorporates age, cystatin C (Cys-C), uric acid (UA), and sex, demonstrated robust performance, with an AUC of 0.818 (95% CI 0.796-0.817) in the training set and an AUC of 0.82 (95% CI 0.787-0.853) in the validation set. Calibration tests yielded Brier scores of 0.160 and 0.158, respectively. Decision curves indicated optimal prediction probability intervals of 6-99.02% and 7-93.14%, respectively. In the hyperuricaemic population, each 0.5 mg/L increase in Cys-C, 10-year increase in age, and 100 µmol/L increase in UA corresponded to increased risks of 13%, 81%, and 73%, respectively. RCS analysis revealed nonlinear relationships for age and Cys-C and a linear relationship for UA, with sex-specific distribution patterns. The machine learning-based model incorporating these four indicators demonstrated excellent predictive performance for renal impairment in hyperuricaemic patients. These findings suggest that monitoring Cys-C and UA levels while considering age and sex differences is crucial for risk assessment and prevention strategies.
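
The abstract reports three kinds of evaluation: discrimination (AUC), calibration (Brier score), and clinical utility (decision curve analysis). The following is a minimal sketch, in plain Python, of how these quantities are commonly defined. It is not the authors' code; the abstract does not say which software was used. The function names and the toy labels and probabilities are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher predicted risk than a randomly chosen negative case
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome.
    Lower is better; 0 means perfect predictions."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability pt, the quantity plotted on a
    decision curve: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy data (made up for illustration): 1 = renal impairment, 0 = none.
    y = [1, 1, 1, 0, 0, 1, 0, 0]
    p = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7, 0.6, 0.1]
    print("AUC:", auc(y, p))
    print("Brier score:", brier_score(y, p))
    print("Net benefit at pt = 0.5:", net_benefit(y, p, 0.5))
```

A decision curve repeats the net-benefit calculation over a range of thresholds and compares the model with the "treat all" and "treat none" strategies. The probability intervals quoted in the abstract presumably refer to the thresholds over which the model compares favourably, although the abstract does not define them.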

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6604/11910588/d75cd2221e91/41598_2025_88632_Fig1_HTML.jpg
