Contribution of Structure Learning Algorithms in Social Epidemiology: Application to Real-World Data.

Authors

Colineaux Helene, Lepage Benoit, Chauvin Pierre, Dimeglio Chloe, Delpierre Cyrille, Lefèvre Thomas

Affiliations

EQUITY Team, Centre d'Epidémiologie et de Recherche en Santé des POPulations (CERPOP), Institut National de la Santé et de la Recherche Médicale (INSERM)-Toulouse III University, 37 Allées Jules Guesde, 31062 Toulouse, France.

Epidemiology Department, Toulouse Teaching Hospital, 37 Allées Jules Guesde, 31062 Toulouse, France.

Publication information

Int J Environ Res Public Health. 2025 Feb 27;22(3):348. doi: 10.3390/ijerph22030348.

DOI: 10.3390/ijerph22030348
PMID: 40238329
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11941975/
Abstract

Epidemiologists often handle large datasets with numerous variables and are currently seeing a growing wealth of techniques for data analysis, such as machine learning. Critical aspects involve addressing causality, often based on observational data, and dealing with the complex relationships between variables to uncover the overall structure of variable interactions, causal or not. Structure learning (SL) methods aim to automatically or semi-automatically reveal the structure of variables' relationships. The objective of this study is to delineate some of the potential contributions and limitations of structure learning methods when applied to social epidemiology topics and the search for determinants of healthcare system access. We applied SL techniques to a real-world dataset, namely the 2010 wave of the SIRS cohort, which included a sample of 3006 adults from the Paris region, France. Healthcare utilization, encompassing both direct and indirect access to care, was the primary outcome. Candidate determinants included health status, demographic characteristics, and socio-cultural and economic positions. We present two approaches: a non-automated epidemiological method (an initial expert knowledge network and stepwise logistic regression models) and three SL techniques using various algorithms, with and without knowledge constraints. We compared the results based on the presence, direction, and strength of specific links within the produced network. Although the interdependencies and relative strengths identified by both approaches were similar, the SL algorithms detected fewer associations with the outcome than the non-automated method. Relationships between variables were sometimes incorrectly oriented when using a purely data-driven approach. SL algorithms can be valuable in exploratory stages, helping to generate new hypotheses or to mine novel databases. However, results should be validated against prior knowledge and supplemented with additional confirmatory analyses.
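The role of knowledge constraints mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a hypothetical temporal ordering of variables ("tiers") and applies it after the fact to a hypothetical list of learned edges. Real SL software typically enforces such constraints during the search itself. All variable names, the tier assignment, and the helper `apply_tier_constraints` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical tiers, earliest causes first. Names are illustrative only
# and are not taken from the SIRS cohort.
TIERS: Dict[str, int] = {
    "age": 0,
    "sex": 0,
    "education": 1,
    "income": 1,
    "health_status": 2,
    "healthcare_use": 3,  # outcome placed last, so it cannot cause the others
}


def apply_tier_constraints(
    edges: List[Tuple[str, str]], tiers: Dict[str, int]
) -> Tuple[Set[Tuple[str, str]], Set[FrozenSet[str]]]:
    """Re-orient learned edges so they always point from a lower tier to a higher one.

    Edges between two variables in the same tier are returned as undirected,
    because the tier ordering carries no information about their direction.
    """
    directed: Set[Tuple[str, str]] = set()
    undirected: Set[FrozenSet[str]] = set()
    for a, b in edges:
        if tiers[a] < tiers[b]:
            directed.add((a, b))
        elif tiers[a] > tiers[b]:
            directed.add((b, a))  # flip an edge that contradicts the ordering
        else:
            undirected.add(frozenset((a, b)))
    return directed, undirected


# Hypothetical output of a purely data-driven run, including one edge whose
# direction contradicts prior knowledge (outcome -> income).
learned = [
    ("healthcare_use", "income"),
    ("age", "health_status"),
    ("education", "income"),
    ("health_status", "healthcare_use"),
]

directed, undirected = apply_tier_constraints(learned, TIERS)
print(sorted(directed))
print([sorted(pair) for pair in undirected])
```

In this sketch the implausible edge from the outcome to income is flipped, while the education and income pair stays undirected because both sit in the same assumed tier.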

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e85/11941975/b70ce36663a6/ijerph-22-00348-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e85/11941975/ab90e5f2aa25/ijerph-22-00348-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e85/11941975/390695af223c/ijerph-22-00348-g002.jpg
