Exploring explainable machine learning algorithms to model predictors of tobacco use among men in Sub Sahara Africa between 2018 and 2023.

Authors

Melaku Mequannent Sharew, Baykemagn Nebebe Demis, Yohannes Lamrot, Zegeye Adem Tsegaw

Affiliations

Department of Health Informatics, Institute of Public Health, University of Gondar, Gondar, Ethiopia.

Department of Environmental and Occupational Health and Safety, Institute of Public Health, College of Medicine and Health Science, University of Gondar, Gondar, Ethiopia.

Publication information

Sci Rep. 2025 Jul 9;15(1):24646. doi: 10.1038/s41598-025-09380-6.

DOI:10.1038/s41598-025-09380-6
PMID:40634414
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12241580/
Abstract

Tobacco smoking is a significant public health issue in sub-Saharan Africa, with its prevalence shaped by various demographic factors. This study aimed to model predictors of tobacco use among men in sub-Saharan Africa between 2018 and 2023 using machine learning algorithms. Data from Demographic and Health Surveys covering 147,466 men were analyzed. STATA version 17 was used for data cleaning and descriptive statistics, and Python 3.9 was used for machine learning predictions. Six machine learning models were compared to identify the key predictors of tobacco use among men: Decision Tree, Logistic Regression, Random Forest, k-nearest neighbors (KNN), eXtreme Gradient Boosting (XGBoost), and AdaBoost. Hyperparameters were optimized using Randomized Search with tenfold cross-validation to improve model performance. The SHapley Additive exPlanations (SHAP) method was used to assess the importance of each predictor. Model performance was evaluated using accuracy, precision, recall, F1 score, and area under the curve (AUC). The study found a pooled tobacco use prevalence of 14.73%, with no significant variation between countries. High tobacco use was observed in Mozambique, Zambia, Benin, Mali, Mauritania, Senegal, Guinea, Sierra Leone, and Liberia, with Tanzania, Benin, and Senegal reporting the highest rates. The XGBoost algorithm attained an accuracy of 98% and an AUC of 97%. SHAP analysis revealed that age, education, wealth index, religion, residence, internet use, occupation, age at first sex, number of sexual partners, and marital status were key predictors. These findings underscore the need for targeted public health interventions and highlight the value of machine learning in identifying at-risk populations and addressing socio-cultural and economic factors influencing tobacco use.
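
The evaluation metrics named in the abstract (accuracy, precision, recall, F1 score, and AUC) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the labels and scores below are made-up toy values, the function names are hypothetical, and the 0.5 decision threshold is an assumption. AUC is computed here as the fraction of (user, non-user) pairs in which the user receives the higher score, with ties counted as one half.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = tobacco user."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict:
    """Accuracy, precision, recall and F1; undefined ratios fall back to 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 3 users among 10 respondents.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.35, 0.05, 0.15]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print(f"AUC = {auc:.3f}")
```

One reason to report precision, recall, F1 and AUC alongside accuracy: with a pooled prevalence of 14.73%, a model that labels every respondent a non-user would already reach 85.27% accuracy while having a recall of zero.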

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/0eba78c542e5/41598_2025_9380_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/f871b9a4e10d/41598_2025_9380_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/8174d706bcf1/41598_2025_9380_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/936bc4724551/41598_2025_9380_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/eec4c0b53e3a/41598_2025_9380_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/f9d155e665b4/41598_2025_9380_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/819072839d93/41598_2025_9380_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/57d09bc5a222/41598_2025_9380_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/5754fb5e73b6/41598_2025_9380_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/f9a0f2660ad3/41598_2025_9380_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a19f/12241580/a06424afc889/41598_2025_9380_Fig11_HTML.jpg

Similar articles

1. Exploring explainable machine learning algorithms to model predictors of tobacco use among men in Sub Sahara Africa between 2018 and 2023.
   Sci Rep. 2025 Jul 9;15(1):24646. doi: 10.1038/s41598-025-09380-6.
2. Application of machine learning algorithms to model predictors of informed contraceptive choice among reproductive age women in six high fertility rate sub Sahara Africa countries.
   BMC Public Health. 2025 May 29;25(1):1986. doi: 10.1186/s12889-025-23242-w.
3. Construction and validation of HBV-ACLF bacterial infection diagnosis model based on machine learning.
   BMC Infect Dis. 2025 Jul 1;25(1):847. doi: 10.1186/s12879-025-11199-5.
4. Supervised Machine Learning Models for Predicting Sepsis-Associated Liver Injury in Patients With Sepsis: Development and Validation Study Based on a Multicenter Cohort Study.
   J Med Internet Res. 2025 May 26;27:e66733. doi: 10.2196/66733.
5. Domestic violence and its determinants among reproductive-age women in Sub-saharan Africa: a multilevel analysis of 2019-2024 demographic and health survey data.
   BMC Public Health. 2025 Jul 2;25(1):2288. doi: 10.1186/s12889-025-23544-z.
6. Prediction of Insulin Resistance in Nondiabetic Population Using LightGBM and Cohort Validation of Its Clinical Value: Cross-Sectional and Retrospective Cohort Study.
   JMIR Med Inform. 2025 Jun 13;13:e72238. doi: 10.2196/72238.
7. Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
   Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
8. Prediction of caesarean section birth using machine learning algorithms among pregnant women in a district hospital in Ghana.
   BMC Pregnancy Childbirth. 2025 Jul 2;25(1):690. doi: 10.1186/s12884-025-07716-8.
9. Tobacco packaging design for reducing tobacco use.
   Cochrane Database Syst Rev. 2017 Apr 27;4(4):CD011244. doi: 10.1002/14651858.CD011244.pub2.
10. Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm.
    Clin Orthop Relat Res. 2024 Jan 1;482(1):143-157. doi: 10.1097/CORR.0000000000002706. Epub 2023 Jun 12.
