The Institute of Scientific and Industrial Research, Osaka University, Mihogaoka 8-1, Ibaraki, Osaka 567-0047, Japan.
Neural Netw. 2011 Oct;24(8):875-80. doi: 10.1016/j.neunet.2011.05.017. Epub 2011 Jun 15.
Many statistical methods have been proposed for estimating causal models in the classical setting, where there are fewer variables than observations. However, modern datasets such as gene expression data increase the need for high-dimensional causal modeling in the challenging setting where variables outnumber observations by orders of magnitude. In this paper, we propose a method for finding exogenous variables in a linear non-Gaussian causal model. The method requires much smaller sample sizes than conventional methods and works even when there are orders of magnitude more variables than observations. Exogenous variables act as triggers that activate causal chains in the model, and identifying them leads to more efficient experimental designs and a better understanding of the causal mechanism. We evaluate the method in experiments with artificial data and real-world gene expression data.
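To make the terminology concrete, the following is a minimal sketch of the setting only, not of the estimation method proposed in the paper. It assumes the usual form of a linear non-Gaussian acyclic model, x = Bx + e, with a strictly lower-triangular coefficient matrix B and independent Laplace noise e. Under that assumption, a variable is exogenous when its row of B is all zeros, so it equals its own noise term. The names `simulate_linear_non_gaussian` and `exogenous_indices`, the sparsity level, and the dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_linear_non_gaussian(B, n_obs, rng):
    """Sample from x = B x + e, assuming B is strictly lower triangular.

    The noise is Laplace-distributed, one common non-Gaussian choice
    (an assumption of this sketch, not something the abstract specifies).
    """
    p = B.shape[0]
    e = rng.laplace(size=(n_obs, p))
    X = np.zeros((n_obs, p))
    # Variables are generated in causal order: each one is a weighted
    # sum of its already-generated parents plus its own noise term.
    for i in range(p):
        X[:, i] = X[:, :i] @ B[i, :i] + e[:, i]
    return X, e


def exogenous_indices(B):
    """Indices of variables with no parents (all-zero rows of B)."""
    return np.flatnonzero(~B.any(axis=1))


rng = np.random.default_rng(0)
p, n_obs = 50, 10  # more variables than observations

# Sparse, strictly lower-triangular coefficients give an acyclic model.
weights = np.tril(rng.uniform(-0.8, 0.8, size=(p, p)), k=-1)
mask = rng.random((p, p)) < 0.1
B = weights * mask

X, e = simulate_linear_non_gaussian(B, n_obs, rng)
exo = exogenous_indices(B)
print("number of exogenous variables:", len(exo))
```

Here the exogenous variables are read off the known matrix B. In real data B is unknown, which is why identifying them from few observations requires a dedicated estimation method such as the one the abstract describes.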