Department of Computer Science, Duke University, Durham, North Carolina 27708, USA.
Genome Res. 2009 Nov;19(11):2090-100. doi: 10.1101/gr.094144.109. Epub 2009 Aug 3.
Transcriptional regulation is largely enacted by transcription factors (TFs) binding DNA. Large numbers of TF binding motifs have been revealed by ChIP-chip experiments followed by computational DNA motif discovery. However, the success of motif discovery algorithms has been limited when applied to sequences bound in vivo (such as those identified by ChIP-chip) because the observed TF-DNA interactions are not necessarily direct: Some TFs predominantly associate with DNA indirectly through protein partners, while others exhibit both direct and indirect binding. Here, we present the first method for distinguishing between direct and indirect TF-DNA interactions, integrating in vivo TF binding data, in vivo nucleosome occupancy data, and motifs from in vitro protein binding microarray experiments. When applied to yeast ChIP-chip data, our method reveals that only 48% of the data sets can be readily explained by direct binding of the profiled TF, while 16% can be explained by indirect DNA binding. In the remaining 36%, none of the motifs used in our analysis was able to explain the ChIP-chip data, either because the data were too noisy or because the set of motifs was incomplete. As more in vitro TF DNA binding motifs become available, our method could be used to build a complete catalog of direct and indirect TF-DNA interactions. Our method is not restricted to yeast or to ChIP-chip data, but can be applied in any system for which both in vivo binding data and in vitro DNA binding motifs are available.