Predicting metabolic dysfunction associated steatotic liver disease using explainable machine learning methods.

Author information

Yu Yihao, Yang Yuqi, Li Qian, Yuan Jing, Zha Yan

Affiliations

Master of Finance, Australian National University, Canberra, Australia.

Department of Nephrology, Guizhou Provincial People's Hospital, Guiyang, 550002, China.

Publication information

Sci Rep. 2025 Apr 11;15(1):12382. doi: 10.1038/s41598-025-96478-6.

Abstract

Early and accurate identification of patients at high risk of metabolic dysfunction-associated steatotic liver disease (MASLD) is critical for prevention and may improve prognosis. We aimed to develop and validate an explainable prediction model for MASLD in the adult population using machine learning (ML) approaches. This national cross-sectional study used data from the National Health and Nutrition Examination Survey from 2017 to 2020, comprising 13,436 participants who were randomly split into training (70%), internal validation (20%), and external validation (10%) cohorts. MASLD was defined based on transient elastography and cardiometabolic risk factors. Using 50 easily obtained clinical characteristics, six ML algorithms were used to develop prediction models. Predictive performance was compared using several evaluation metrics, including the area under the receiver-operating-characteristic curve (AUC) and the area under the precision-recall (P-R) curve. Recursive feature elimination was applied to select the optimal feature subset. The Shapley Additive exPlanations method provided global and local explanations for the model. The random forest (RF) model showed the best discriminative ability among the six ML models, and the optimal 10-feature RF model was chosen as the final model. The final model accurately predicted MASLD in the internal and external validation cohorts (AUC: 0.928 and 0.918; area under the P-R curve: 0.876 and 0.863, respectively), and it outperformed each of the traditional risk indicators for MASLD. In summary, an explainable 10-feature prediction model with excellent discrimination and calibration was developed and validated for MASLD using an RF algorithm and easily obtained clinical data.
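The 70/20/10 random split and the two discrimination metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names `split_70_20_10`, `roc_auc`, and `average_precision` are hypothetical, the seed is arbitrary, and average precision is only one common estimator of the area under the P-R curve (the abstract does not say which estimator was used).

```python
import random

def split_70_20_10(n, seed=0):
    """Shuffle indices 0..n-1 and cut them into 70% / 20% / remainder parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)   # seed is an arbitrary assumption
    n_train = int(0.7 * n)
    n_internal = int(0.2 * n)
    train = idx[:n_train]
    internal = idx[n_train:n_train + n_internal]
    external = idx[n_train + n_internal:]   # whatever is left (about 10%)
    return train, internal, external

def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as 0.5. Quadratic in sample size, so for illustration only.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Average precision: mean of precision at the rank of each positive.

    Tied scores are ranked in input order, a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

# Cohort sizes for the 13,436 participants reported in the abstract.
train, internal, external = split_70_20_10(13436)

# Toy labels and scores (made up, not study data).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

In practice a library such as scikit-learn would normally be used for the split, the random forest, and the metrics; the hand-written versions here only make the definitions explicit.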

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cce8/11992218/91160cd88883/41598_2025_96478_Fig1_HTML.jpg
