Wille Anja, Bühlmann Peter
Stat Appl Genet Mol Biol. 2006;5:Article1. doi: 10.2202/1544-6115.1170. Epub 2006 Jan 4.
As a powerful tool for analyzing full conditional (in-)dependencies between random variables, graphical models have become increasingly popular for inferring genetic networks from gene expression data. However, full (unconstrained) conditional relationships between random variables can only be estimated accurately if the number of observations is relatively large compared with the number of variables, a condition that high-throughput genomic data usually do not meet. Recently, simplified graphical modeling approaches have been proposed to determine dependencies between gene expression profiles. For sparse graphical models such as genetic networks, it is assumed that the zero- and first-order conditional independencies still reflect the full conditional independence structure between variables reasonably well. Moreover, low-order conditional independencies have the advantage that they can be estimated accurately even from a small number of observations. Therefore, using only zero- and first-order conditional dependencies to infer the complete graphical model can be very useful. Here, we analyze the statistical and probabilistic properties of these low-order conditional independence graphs (called 0-1 graphs). We find that for faithful graphical models, the 0-1 graph contains at least all edges of the full conditional independence graph (concentration graph). For simple structures such as Markov trees, the 0-1 graph even coincides with the concentration graph. Furthermore, we present some asymptotic results and demonstrate in a simulation study that, despite their simplicity, 0-1 graphs are generally good estimators of sparse graphical models. Finally, the biological relevance of some applications is summarized.
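The relationship between the 0-1 graph and the concentration graph described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the abstract: the variables are Gaussian, an edge of the 0-1 graph is drawn when the marginal correlation and every first-order partial correlation of a pair are nonzero, and the population correlation matrix is known, so a numerical tolerance stands in for the statistical tests one would need with real data. The function names (`partial_corr_first_order`, `zero_one_graph`, `concentration_graph`) and the chain example are hypothetical.

```python
import itertools

import numpy as np


def partial_corr_first_order(r, i, j, k):
    """First-order partial correlation of variables i and j given k."""
    num = r[i, j] - r[i, k] * r[j, k]
    den = np.sqrt((1.0 - r[i, k] ** 2) * (1.0 - r[j, k] ** 2))
    return num / den


def zero_one_graph(r, tol=1e-8):
    """Edges (i, j) whose marginal correlation and all first-order
    partial correlations exceed tol in absolute value (assumed definition)."""
    p = r.shape[0]
    edges = set()
    for i, j in itertools.combinations(range(p), 2):
        if abs(r[i, j]) <= tol:
            continue  # zero-order (marginal) independence removes the edge
        others = (k for k in range(p) if k not in (i, j))
        if all(abs(partial_corr_first_order(r, i, j, k)) > tol for k in others):
            edges.add((i, j))
    return edges


def concentration_graph(r, tol=1e-8):
    """Edges (i, j) with a nonzero off-diagonal entry in the inverse
    correlation matrix, i.e. full conditional dependence (Gaussian case)."""
    prec = np.linalg.inv(r)
    p = r.shape[0]
    return {
        (i, j)
        for i, j in itertools.combinations(range(p), 2)
        if abs(prec[i, j]) > tol
    }


# Hypothetical example: a Markov chain X0 - X1 - X2 - X3 (a simple tree)
# with correlation rho^|i-j| between X_i and X_j.
rho = 0.6
idx = np.arange(4)
chain_corr = rho ** np.abs(idx[:, None] - idx[None, :])

edges_01 = zero_one_graph(chain_corr)
edges_full = concentration_graph(chain_corr)
```

For this chain, both functions should return only the three edges between neighbouring variables, which is consistent with the statement that the 0-1 graph coincides with the concentration graph for Markov trees. With sampled data, the exact-zero checks would have to be replaced by tests on estimated correlations, and the choice of test and significance level is left open here.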