DAGAF: A directed acyclic generative adversarial framework for joint structure learning and tabular data synthesis.

Authors

Petkov Hristo, MacLellan Calum, Dong Feng

Affiliation

Department of Computer and Information Sciences, University of Strathclyde, 16 Richmond Street, Glasgow, Lanarkshire G1 1XQ United Kingdom.

Publication

Appl Intell (Dordr). 2025;55(7):602. doi: 10.1007/s10489-025-06410-8. Epub 2025 Mar 31.

DOI:10.1007/s10489-025-06410-8
PMID:40176989
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11958450/
Abstract

Understanding the causal relationships between data variables can provide crucial insights into the construction of tabular datasets. Most existing causality learning methods focus on applying a single identifiable causal model, such as the Additive Noise Model (ANM) or the Linear non-Gaussian Acyclic Model (LiNGAM), to discover the dependencies exhibited in observational data. We improve on this approach by introducing a novel dual-step framework capable of performing both causal structure learning and tabular data synthesis under multiple causal model assumptions. Our approach uses Directed Acyclic Graphs (DAGs) to represent causal relationships among data variables. By applying various functional causal models, including ANM, LiNGAM and the Post-Nonlinear model (PNL), we implicitly learn the contents of the DAG to simulate the generative process of observational data, effectively replicating the real data distribution. This is supported by a theoretical analysis that explains the multiple loss terms comprising the objective function of the framework. Experimental results demonstrate that DAGAF outperforms many existing methods in structure learning, achieving significantly lower Structural Hamming Distance (SHD) scores across both real-world and benchmark datasets (improvements over the state of the art of 47% on Sachs, 11% on Child, 5% on Hailfinder and 7% on Pathfinder), while being able to produce diverse, high-quality samples.
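
The abstract reports structure-learning quality as Structural Hamming Distance (SHD). As a minimal sketch of that metric, and not the authors' evaluation code, the function below counts the edge additions, deletions and reversals that separate an estimated directed graph from a reference graph. It assumes both graphs are given as square binary adjacency matrices with `adj[i][j] = 1` meaning an edge from `i` to `j`, and it assumes a reversed edge counts as one error (some implementations count it as two). The function name and the toy graphs are hypothetical.

```python
import numpy as np


def structural_hamming_distance(true_adj, est_adj):
    """Count edge additions, deletions and reversals between two directed graphs.

    Assumption: a reversed edge (i -> j estimated as j -> i) counts once.
    Self-loops are ignored, since a DAG has none.
    """
    a = (np.asarray(true_adj) != 0).astype(int)
    b = (np.asarray(est_adj) != 0).astype(int)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("both inputs must be square matrices of the same shape")

    # Entries where the two graphs disagree about a directed edge.
    diff = np.abs(a - b)
    # Fold both directions of each node pair together, so a reversal
    # (two disagreeing entries) is clipped to a single error.
    diff = diff + diff.T
    diff[diff > 1] = 1
    # Each unordered node pair now appears twice; count the upper triangle only.
    return int(np.triu(diff, k=1).sum())


# Toy reference graph: 0 -> 1 -> 2
true_dag = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]
# Toy estimate: 1 -> 0 (reversed), 1 -> 2 (correct), 0 -> 2 (extra)
est_dag = [[0, 0, 1],
           [1, 0, 1],
           [0, 0, 0]]

shd = structural_hamming_distance(true_dag, est_dag)
print(shd)  # one reversal + one extra edge
```

A lower SHD means the estimated graph is closer to the reference; a value of 0 means the two graphs have identical edges.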

Similar articles

1
A multivariate additive noise model for complete causal discovery.
Neural Netw. 2018 Jul;103:44-54. doi: 10.1016/j.neunet.2018.03.013. Epub 2018 Mar 26.
2
Non-Gaussian Methods for Causal Structure Learning.
Prev Sci. 2019 Apr;20(3):431-441. doi: 10.1007/s11121-018-0901-x.
3
Causal Artificial Intelligence Models of Food Quality Data.
Food Technol Biotechnol. 2024 Mar;62(1):102-109. doi: 10.17113/ftb.62.01.24.8301.
4
Utility-based Analysis of Statistical Approaches and Deep Learning Models for Synthetic Data Generation With Focus on Correlation Structures: Algorithm Development and Validation.
JMIR AI. 2025 Mar 20;4:e65729. doi: 10.2196/65729.
5
Structural factor equation models for causal network construction via directed acyclic mixed graphs.
Biometrics. 2021 Jun;77(2):573-586. doi: 10.1111/biom.13322. Epub 2020 Jul 18.
6
Testing Directed Acyclic Graph via Structural, Supervised and Generative Adversarial Learning.
J Am Stat Assoc. 2024;119(547):1833-1846. doi: 10.1080/01621459.2023.2220169. Epub 2023 Jul 12.
7
Scalable Causal Structure Learning: Scoping Review of Traditional and Deep Learning Algorithms and New Opportunities in Biomedicine.
JMIR Med Inform. 2023 Jan 17;11:e38266. doi: 10.2196/38266.
8
Learning Subject-Specific Directed Acyclic Graphs With Mixed Effects Structural Equation Models From Observational Data.
Front Genet. 2018 Oct 2;9:430. doi: 10.3389/fgene.2018.00430. eCollection 2018.
9
On the Role of Entropy-Based Loss for Learning Causal Structure With Continuous Optimization.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):1594-1608. doi: 10.1109/TNNLS.2023.3327357. Epub 2025 Jan 7.
