Yu Tianwei, Li Ker-Chau
Department of Statistics, University of California-Los Angeles, Los Angeles, CA 90095-1554, USA.
Bioinformatics. 2005 Nov 1;21(21):4033-8. doi: 10.1093/bioinformatics/bti656. Epub 2005 Sep 6.
Microarray gene expression and cross-linking chromatin immunoprecipitation data contain voluminous information that can help identify transcriptional regulatory networks at the full genome scale. Such high-throughput data are noisy, however. In contrast, the biomedical literature contains many evidenced transcription factor (TF)-target gene binding relationships that have been elucidated at the molecular level. But such sporadically generated knowledge offers only glimpses of limited patches of the network. How to incorporate this valuable knowledge resource to build more reliable network models remains an open question.
We present a modified factor analysis approach. Our algorithm starts with the evidenced TF-gene linkages. Using the high-throughput data, it iterates between a network configuration estimation step and a connection strength estimation step until convergence. We report two comprehensive regulatory networks obtained for Saccharomyces cerevisiae, one under the normal growth condition and the other under the environmental stress condition.
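The alternating scheme described above can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the model details, so the code assumes a plain linear factor model (expression = loadings × TF activities + noise), uses ordinary least squares for both steps, and selects links with a simple coefficient threshold. The function name `fit_network`, its parameters, and the threshold rule are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_network(E, evidence, candidates, threshold=0.3, max_iter=50):
    """Illustrative alternating fit (assumed model, not the published one).

    E          : genes x samples expression matrix
    evidence   : boolean genes x TFs mask of literature-evidenced links
    candidates : boolean genes x TFs mask of links allowed to be added
                 (e.g. suggested by ChIP data)
    Returns (Z, A): boolean network configuration and link strengths.
    """
    evidence = evidence.astype(bool)
    allowed = candidates.astype(bool) | evidence
    Z = evidence.copy()            # start from the evidenced linkages
    A = Z.astype(float)            # initial strengths: 1 on evidenced links
    n_genes = Z.shape[0]

    for _ in range(max_iter):
        # Strength step, part 1: TF activities P (TFs x samples) given A.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Rescale each activity profile so the threshold is comparable
        # across TFs (an assumption of this sketch).
        P = P / np.maximum(P.std(axis=1, keepdims=True), 1e-12)

        Z_new = evidence.copy()
        A_new = np.zeros_like(A)
        for g in range(n_genes):
            idx = np.flatnonzero(allowed[g])
            if idx.size == 0:
                continue
            # Configuration step: regress gene g on all allowed TFs and
            # keep strong links; evidenced links are always retained.
            coef = np.linalg.lstsq(P[idx].T, E[g], rcond=None)[0]
            keep = (np.abs(coef) >= threshold) | evidence[g, idx]
            sel = idx[keep]
            Z_new[g, sel] = True
            # Strength step, part 2: refit strengths on the kept links.
            if sel.size:
                A_new[g, sel] = np.linalg.lstsq(P[sel].T, E[g], rcond=None)[0]

        converged = np.array_equal(Z_new, Z)
        Z, A = Z_new, A_new
        if converged:              # configuration stopped changing
            break
    return Z, A
```

In this sketch, convergence simply means the set of links no longer changes between iterations. A real analysis would need a principled selection criterion and a noise model in place of the fixed threshold.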
http://kiefer.stat.ucla.edu/lap2/download/bti656_supplement.pdf.