Huang Liqiong, Luo Yu, Zhang Li, Wu Mengqi, Hu Lirong
Department of Ultrasound, Chengdu Integrated Traditional Chinese Medicine and Western Medicine Hospital, Sichuan Province, No. 18 Wanxiang North Road, High Tech Zone, Chengdu, China.
BMC Gastroenterol. 2025 Apr 14;25(1):255. doi: 10.1186/s12876-025-03850-x.
Metabolic dysfunction-associated fatty liver disease (MAFLD) is a common chronic liver disease and a significant public health issue, yet current risk stratification methods remain inadequate. This study aimed to use machine learning to identify significant features and develop a predictive model, and to assess how well that model discriminates among low-, moderate-, and high-risk MAFLD strata in adults.
Data from the 2021-2023 National Health and Nutrition Examination Survey (NHANES) database were analyzed. Vibration-controlled transient elastography measurements were used for risk stratification: the controlled attenuation parameter to evaluate steatosis and liver stiffness to evaluate fibrosis. Participants were assigned to low-risk, moderate-risk, and high-risk groups based on specific criteria. Feature selection was performed using Least Absolute Shrinkage and Selection Operator (LASSO) regression and random forest classification.
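The selection-then-classification workflow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic (generated with `make_classification`), the three classes merely stand in for the low-, moderate-, and high-risk groups, and the use of an L1-penalized multinomial logistic regression as the LASSO step, the penalty strength `C=0.05`, and the forest size are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the survey data (assumption): 40 candidate features,
# 3 classes playing the role of low / moderate / high risk.
X, y = make_classification(
    n_samples=1500, n_features=40, n_informative=8, n_redundant=4,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# L1-penalized (LASSO-type) logistic regression: features whose coefficients
# are shrunk to zero for every class are dropped by SelectFromModel.
lasso = LogisticRegression(penalty="l1", solver="saga", C=0.05, max_iter=5000)
selector = SelectFromModel(lasso)

# Standardize, select features, then fit a random forest on the survivors.
model = make_pipeline(
    StandardScaler(),
    selector,
    RandomForestClassifier(n_estimators=300, random_state=0),
)
model.fit(X_train, y_train)

selected = model.named_steps["selectfrommodel"].get_support()
n_selected = int(selected.sum())
val_accuracy = model.score(X_val, y_val)
print(f"{n_selected} of {X.shape[1]} features kept; "
      f"validation accuracy {val_accuracy:.2f}")
```

Wrapping the scaler, selector, and classifier in one pipeline keeps the feature selection inside the training split, so the validation set does not influence which features are kept.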
A total of 4,227 participants were included. LASSO regression identified 16 significant predictors; the top 10 comprised demographic factors (age, gender, race, hypertension history), clinical factors (body mass index, waist circumference, hemoglobin, glycohemoglobin, lymphocyte count), and education level. In the validation set, the random forest model achieved an area under the receiver operating characteristic curve (AUC) of 0.80, with class-specific AUCs of 0.83, 0.66, and 0.79 for the low-, moderate-, and high-risk groups, respectively.
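Class-specific AUCs for a three-class model are commonly obtained one-vs-rest: each risk group in turn is treated as the positive class against the other two. The sketch below shows that computation on synthetic data. It is an illustration under assumptions, not a reproduction of the reported figures: the data, the model settings, and the label order are hypothetical, and the abstract does not state how the overall AUC was averaged, so the macro average shown here is only one possibility.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic three-class data (assumption); classes 0/1/2 stand in for
# the low / moderate / high risk groups.
X, y = make_classification(
    n_samples=1500, n_features=20, n_informative=6,
    n_classes=3, n_clusters_per_class=1, random_state=1,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

rf = RandomForestClassifier(n_estimators=300, random_state=1)
rf.fit(X_train, y_train)
proba = rf.predict_proba(X_val)  # shape (n_samples, 3), columns follow rf.classes_

# One-vs-rest AUC per class: class k is positive, the other two are negative.
labels = ["low", "moderate", "high"]
per_class_auc = {
    name: roc_auc_score((y_val == k).astype(int), proba[:, k])
    for k, name in enumerate(labels)
}

# Macro-averaged one-vs-rest AUC: the unweighted mean of the per-class values.
macro_auc = roc_auc_score(y_val, proba, multi_class="ovr", average="macro")

for name, auc in per_class_auc.items():
    print(f"{name}-risk AUC: {auc:.2f}")
print(f"macro one-vs-rest AUC: {macro_auc:.2f}")
```

Reporting the per-class values alongside any single summary figure matters because an average can hide a weak class, such as the moderate-risk group here.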
Our machine learning model shows excellent performance in stratifying MAFLD risk using readily available clinical and demographic parameters. It could serve as a valuable screening tool for referring high-risk patients for further hepatological evaluation.