

The Visualization of the Importance of Covariance Importance in a Machine Learning Model for Advanced Liver Fibrosis in a Nationally Representative Sample.

Authors

Huang Alexander A, Huang Samuel Y

Affiliations

Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.

Virginia Commonwealth University School of Medicine, Richmond, Virginia, USA.

Publication information

JGH Open. 2025 Jul 14;9(7):e70200. doi: 10.1002/jgh3.70200. eCollection 2025 Jul.

Abstract

INTRODUCTION

Because liver disease can be severe, accurate prediction is vital for early intervention. This study aims to improve the prediction of advanced liver fibrosis and to investigate its associations with demographic, laboratory, physical examination, and lifestyle factors, with the goal of supporting healthier lifestyle choices and timely management of liver disease.

METHODS

This cross-sectional study included adults from the US National Health and Nutrition Examination Survey (NHANES, 2017-2020). Questionnaires captured demographic, dietary, exercise, and mental health information. Advanced fibrosis was defined by liver stiffness measurement (LSM) at a threshold of 9.5 kPa. An XGBoost machine learning model predicted advanced fibrosis, and its performance was assessed using the area under the receiver operating characteristic curve (AUROC). SHAP (SHapley Additive exPlanations) provided visual explanations of the model's predictions and feature contributions. Feature importance was measured by model gain, cover, and frequency, enabling transparent and interpretable analysis.
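The outcome definition and the evaluation metric described above can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the sample values are hypothetical, the function names are invented for illustration, and treating an LSM exactly equal to 9.5 kPa as advanced fibrosis (>= rather than >) is an assumption, since the abstract does not say which side of the threshold is inclusive. AUROC is computed here by direct pairwise comparison, which is equivalent to the usual rank-based definition but only practical for small samples.

```python
from typing import Sequence

LSM_THRESHOLD_KPA = 9.5  # threshold stated in the abstract


def label_advanced_fibrosis(lsm_kpa: Sequence[float]) -> list[int]:
    """Return 1 where LSM meets the threshold, else 0 (inclusive >= is an assumption)."""
    return [1 if value >= LSM_THRESHOLD_KPA else 0 for value in lsm_kpa]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative case.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical values for illustration only; these are not NHANES data.
lsm_values = [4.8, 12.1, 9.5, 6.3, 15.0, 7.7]
predicted_risk = [0.10, 0.80, 0.40, 0.45, 0.90, 0.20]

labels = label_advanced_fibrosis(lsm_values)
score = auroc(labels, predicted_risk)
print(labels)
print(round(score, 3))
```

In this toy sample, one positive case (predicted risk 0.40) is ranked below one negative case (0.45), so the AUROC falls short of 1.0; a model that ranked every positive above every negative would score exactly 1.0.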

RESULTS

The study included 6979 adults (age > 18 years) with a mean age of 49.02 years; 3523 (50%) were female. The machine learning model had an area under the receiver operating characteristic curve (AUROC) of 0.885. The top eight covariates by gain included waist circumference (gain = 0.185), GGT (gain = 0.101), platelet count (gain = 0.059), AST (gain = 0.057), weight (gain = 0.049), ferritin (gain = 0.034), and HDL-cholesterol (gain = 0.032).
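As a worked check on the reported importances, the short sketch below ranks the seven gain values given in the abstract and sums them. It assumes that gain is normalized so that all features together sum to 1, which is how XGBoost importance tables are commonly reported but is not stated in the abstract; under that assumption the sum is the share of total model gain attributable to the listed covariates.

```python
# Gain values exactly as reported in the abstract (seven covariates are listed).
reported_gain = {
    "waist circumference": 0.185,
    "GGT": 0.101,
    "platelet count": 0.059,
    "AST": 0.057,
    "weight": 0.049,
    "HDL-cholesterol": 0.032,
    "ferritin": 0.034,
}

# Rank covariates from highest to lowest gain.
ranked = sorted(reported_gain.items(), key=lambda item: item[1], reverse=True)
for rank, (name, gain) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {gain:.3f}")

# Share of total gain covered by the listed covariates, ASSUMING gains are
# normalized to sum to 1 across all model features (not stated in the abstract).
listed_share = round(sum(reported_gain.values()), 3)
print(f"listed covariates account for {listed_share:.3f} of total gain")
```

Sorted by value, ferritin (0.034) ranks just above HDL-cholesterol (0.032), and the seven listed covariates together account for roughly half of the total gain under the normalization assumption.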

CONCLUSION

The machine learning model predicted the risk of advanced liver fibrosis effectively in this sample, with an AUROC of 0.885. By combining demographic information, laboratory results, physical examination findings, and lifestyle factors, the model identified key risk factors associated with liver fibrosis.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7f0d/12259494/1cfd8ed4b299/JGH3-9-e70200-g001.jpg
