Department of Computer Science and Engineering, Chinese University of Hong Kong, Hong Kong.
Neural Comput. 2013 Jun;25(6):1605-41. doi: 10.1162/NECO_a_00444. Epub 2013 Mar 21.
LiNGAM has been successfully applied to some real-world causal discovery problems. Nevertheless, it assumes causal sufficiency; that is, it assumes that the observations have no latent confounders, which may be unrealistic for real-world problems. Taking latent confounders into consideration will improve the reliability and accuracy of estimates of the real causal structure. In this letter, we investigate a model called linear nongaussian acyclic models in the presence of latent gaussian confounders (LiNGAM-GC), which can be seen as a specific case of lvLiNGAM. This model includes latent confounders, which are assumed to be independent and gaussian distributed and to be statistically independent of the disturbances. To tackle the causal discovery problem for this model, we first propose a pairwise cumulant-based measure of causal direction for cause-effect pairs. We prove that, in spite of the presence of latent gaussian confounders, the causal direction of an observed cause-effect pair can be identified under the mild condition that the disturbances are simultaneously supergaussian or subgaussian. We propose a simple and efficient method to detect violations of this condition. We then extend our work to multivariate causal network discovery problems. Specifically, we propose algorithms to estimate the causal network structure, including causal ordering and causal strengths, using an iterative root finding-removing scheme based on the pairwise measure. To address the redundant edge problem caused by the finite sample size effect, we develop an efficient bootstrapping-based pruning algorithm. Experiments on synthetic and real-world data show the applicability of our model and the effectiveness of the proposed algorithms.
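The intuition behind a cumulant-based pairwise statistic can be illustrated with a small simulation. The sketch below is not the measure defined in the letter. It relies only on two standard facts: cumulants of order higher than two vanish for gaussian variables, and cumulants are additive over independent variables. It assumes a hypothetical two-variable model x = e_x + lam_x * c, y = rho * x + e_y + lam_y * c, with Laplace (supergaussian) disturbances and a gaussian confounder c. The function names `cum4`, `simulate_pair`, and `strength_estimates`, and all parameter values, are illustrative assumptions. Under this model the ratio cum(x, x, x, y) / cum(x, x, x, x) equals rho in the population, whereas the least-squares slope is biased by the confounder.

```python
import numpy as np


def cum4(a, b, c, d):
    """Sample fourth-order joint cumulant of four 1-D arrays."""
    # Center each variable; the fourth-order cumulant of zero-mean variables
    # is E[abcd] - E[ab]E[cd] - E[ac]E[bd] - E[ad]E[bc].
    a, b, c, d = (v - v.mean() for v in (a, b, c, d))

    def m(u, v):
        return np.mean(u * v)

    return (np.mean(a * b * c * d)
            - m(a, b) * m(c, d)
            - m(a, c) * m(b, d)
            - m(a, d) * m(b, c))


def simulate_pair(n, rho, lam_x, lam_y, rng):
    """Draw n samples from the assumed confounded pair with x -> y."""
    e_x = rng.laplace(size=n)   # supergaussian disturbance of x
    e_y = rng.laplace(size=n)   # supergaussian disturbance of y
    conf = rng.normal(size=n)   # latent gaussian confounder
    x = e_x + lam_x * conf
    y = rho * x + e_y + lam_y * conf
    return x, y


def strength_estimates(x, y):
    """Return (least-squares slope, cumulant-ratio estimate) of rho."""
    ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    cum = cum4(x, x, x, y) / cum4(x, x, x, x)
    return ols, cum


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = simulate_pair(200_000, rho=0.8, lam_x=1.0, lam_y=1.0, rng=rng)
    ols, cum = strength_estimates(x, y)
    # Population values for these settings: the least-squares slope tends to
    # 0.8 + 1/3 (confounding bias), the cumulant ratio tends to 0.8.
    print(f"least squares: {ols:.3f}  cumulant ratio: {cum:.3f}")
```

Fourth-order sample cumulants are noisy, so the cumulant ratio needs far more data than the least-squares slope to stabilize. This is consistent with the abstract's concern about finite-sample effects, although the bootstrapping-based pruning described there is not reproduced here.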