Machine learning approaches to identify the link between heavy metal exposure and ischemic stroke using the US NHANES data from 2003 to 2018.

Author information

Department of Emergency Center II, People's Hospital of Xinjiang Uygur Autonomous Region, Ürümqi, Xinjiang, China.

Department of Critical Care Medicine, The First Affiliated Hospital of Xinjiang Medical University, Ürümqi, Xinjiang, China.

Publication information

Front Public Health. 2024 Sep 16;12:1388257. doi: 10.3389/fpubh.2024.1388257. eCollection 2024.

DOI:10.3389/fpubh.2024.1388257

PMID:39351032

Original article link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11439780/

Abstract

PURPOSE

The link between heavy metal exposure and ischemic stroke (IS) is poorly understood. This research aimed to develop efficient and interpretable machine learning (ML) models to characterize the association between heavy metal exposure and IS.

METHODS

Data were obtained from the US National Health and Nutrition Examination Survey (NHANES, 2003-2018) database. Seven ML models were trained to identify IS in relation to heavy metal exposure. Model performance was assessed with 10-fold cross-validation, the area under the curve (AUC), F1 score, Brier score, Matthews correlation coefficient (MCC), precision-recall (PR) curves, and decision curve analysis (DCA). Based on these assessments, the best-performing model was selected. Finally, the DALEX package was used for feature explanation and decision-making visualization.
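The evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy labels and probabilities, and the 0.5 classification threshold are all assumptions made for illustration, and only the Python standard library is used.

```python
import math

def confusion(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, _, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the confusion matrix."""
    tp, fp, tn, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def roc_auc(y_true, y_prob):
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at one threshold probability (0 < threshold < 1)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp, fp, _, _ = confusion(y_true, y_pred)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def kfold_indices(n, k=10):
    """Yield (train_idx, test_idx) for k contiguous folds; shuffle the rows first in practice."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

# Toy data (hypothetical, not NHANES): 1 = stroke, 0 = no stroke.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print("F1:", f1_score(y_true, y_pred))
print("MCC:", mcc(y_true, y_pred))
print("Brier:", round(brier_score(y_true, y_prob), 4))
print("AUC:", roc_auc(y_true, y_prob))
print("Net benefit at 0.5:", net_benefit(y_true, y_prob, 0.5))
```

F1 and MCC depend on the chosen classification threshold, whereas AUC and the Brier score are computed from the predicted probabilities directly. A decision curve is obtained by evaluating `net_benefit` over a range of thresholds.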

RESULTS

A total of 15,575 participants were included in this study. Logistic regression (LR; AUC: 0.796) and XGBoost (AUC: 0.789) performed best and were selected. The DALEX package revealed that age, total mercury in blood, poverty-to-income ratio (PIR), and cadmium were the most influential predictors of IS in the logistic regression and XGBoost models.
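The abstract does not state which DALEX explanation produced this ranking. One widely used model-agnostic approach is permutation-based variable importance, which measures how much a loss increases when a single feature's values are shuffled. The sketch below assumes that approach and is not the authors' code: the `predict` rule, the two-column toy data, and all function names are hypothetical.

```python
import random

def brier_loss(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def permutation_importance(predict, X, y, loss, n_repeats=10, seed=0):
    """Mean increase in loss after shuffling each column, one column at a time."""
    rng = random.Random(seed)
    baseline = loss(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = loss(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def predict(row):
    """Hypothetical fixed model: risk depends on column 0 only; column 1 is ignored."""
    return 1.0 if row[0] > 50 else 0.0

# Toy rows (not NHANES data): [age in years, an unrelated noise value].
X = [[30, 5], [42, 1], [48, 9], [35, 3], [61, 7], [70, 2], [55, 8], [66, 4]]
y = [1 if row[0] > 50 else 0 for row in X]

importance = permutation_importance(predict, X, y, brier_loss)
print("importance of age, noise:", importance)
```

A feature the model ignores gets an importance of zero, because shuffling it leaves every prediction unchanged. A larger value means the model relies more on that feature, which describes the model's behavior and does not by itself establish a causal effect on IS.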

CONCLUSION

The logistic regression and XGBoost models showed high efficiency, accuracy, and robustness in identifying associations between heavy metal exposure and IS in NHANES 2003-2018 participants.
