Learning Causal Structures Based on Divide and Conquer.

Publication information

IEEE Trans Cybern. 2022 May;52(5):3232-3243. doi: 10.1109/TCYB.2020.3010004. Epub 2022 May 19.

DOI:10.1109/TCYB.2020.3010004

Abstract

This article addresses two important issues of causal inference in high-dimensional settings. The first is how to reduce redundant conditional independence (CI) tests, which heavily impair the efficiency and accuracy of existing constraint-based methods. The second is how to construct the true causal graph from a set of Markov equivalence classes returned by these methods. For the first issue, we design a recursive decomposition approach in which the original data (a set of variables) are first decomposed into two smaller subsets, each of which is then recursively decomposed into two still smaller subsets until no subset can be decomposed further. Redundant CI tests are reduced by inferring causal relations within these subsets. This decomposition scheme has two advantages: 1) it requires only low-order CI tests and 2) it does not violate d-separation. The complete causal structure is then reconstructed by merging the partial results from all subsets. For the second issue, we employ regression-based CI tests to check CIs in linear non-Gaussian additive noise cases, which can identify more causal directions by [Formula: see text] (or [Formula: see text]). Consequently, causal direction learning is no longer limited by the number of returned V-structures and by consistent propagation. Extensive experiments show that the proposed method not only substantially reduces redundant CI tests but also effectively distinguishes members of the equivalence classes.
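The idea behind regression-based direction learning in the linear non-Gaussian additive noise case can be illustrated with a small two-variable sketch. This is not the article's procedure: the function names (`residual`, `dependence_score`, `infer_direction`), the simulated data, and the dependence measure (absolute correlation of squared values, a crude stand-in for a proper independence test) are all assumptions made for illustration. The sketch relies only on the known property that, with non-Gaussian noise, the regression residual is independent of the regressor in the causal direction but not in the reverse direction.

```python
import numpy as np


def residual(target, predictor):
    """Residual of an ordinary least-squares fit of target on predictor."""
    t = target - target.mean()
    p = predictor - predictor.mean()
    slope = (p @ t) / (p @ p)
    return t - slope * p


def dependence_score(a, b):
    """Hypothetical proxy for dependence: |corr(a^2, b^2)| on centered data.

    OLS residuals are always linearly uncorrelated with the regressor, so a
    nonlinear statistic is needed to expose any remaining dependence.
    """
    a = a - a.mean()
    b = b - b.mean()
    return abs(np.corrcoef(a ** 2, b ** 2)[0, 1])


def infer_direction(x, y):
    """Pick the direction whose residual looks more independent of its regressor."""
    forward = dependence_score(residual(y, x), x)   # hypothesis x -> y
    backward = dependence_score(residual(x, y), y)  # hypothesis y -> x
    label = "x->y" if forward < backward else "y->x"
    return label, forward, backward


# Simulated data (assumed for illustration): linear model with uniform,
# hence non-Gaussian, cause and noise.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-1.0, 1.0, n)
noise = rng.uniform(-1.0, 1.0, n)
y = 0.8 * x + noise

direction, fwd, bwd = infer_direction(x, y)
print(direction, round(fwd, 3), round(bwd, 3))
```

With Gaussian noise both scores would be close to zero and the direction would not be identifiable this way, which is why the non-Gaussian assumption matters. A practical implementation would replace the squared-correlation proxy with a calibrated independence test.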
