Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, Athens, GA, USA.
BMC Plant Biol. 2012 Aug 9;12:138. doi: 10.1186/1471-2229-12-138.
Identifying novel genes relevant to plant cell-wall (PCW) synthesis is a highly important and challenging problem. Although substantial effort has been invested in studying this problem, the vast majority of PCW-related genes remain unknown.
Here we present a computational study focused on identifying novel PCW genes in Arabidopsis, based on co-expression analyses of transcriptomic data collected under 351 conditions, using a bi-clustering technique. Our analysis identified 217 gene clusters (modules) that are highly co-expressed under some of the experimental conditions, each containing at least one gene annotated as PCW-related according to the Purdue Cell Wall Gene Families database. These co-expression modules cover 349 known/annotated PCW genes and 2,438 new candidates. For each candidate gene, we annotated the specific PCW synthesis stages in which it is involved and predicted its detailed function. In addition, for the co-expressed genes in each module, we predicted and analyzed the cis-regulatory motifs in their promoters using our motif discovery pipeline, providing strong evidence that the genes in each co-expression module are transcriptionally co-regulated. From all the co-expression modules, we infer, using three complementary methods, that 108 modules are related to four major PCW synthesis components.
We believe the approach and data presented here will be useful for the further identification and characterization of PCW genes. All the predicted PCW genes, co-expression modules, motifs and their annotations are available in a web-based database: http://csbl.bmb.uga.edu/publications/materials/shanwang/CWRPdb/index.html.
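The module-selection step described above (keep only co-expression modules that contain at least one annotated PCW gene, then separate known genes from new candidates) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `select_pcw_modules`, the module names, and the gene identifiers are hypothetical placeholders, and the bi-clustering itself is assumed to have already produced the modules.

```python
from typing import Dict, Set, Tuple


def select_pcw_modules(
    modules: Dict[str, Set[str]],
    known_pcw: Set[str],
) -> Tuple[Dict[str, Set[str]], Set[str], Set[str]]:
    """Keep modules containing at least one known PCW gene.

    Returns the kept modules, the known PCW genes they cover,
    and the remaining genes, which are treated as new candidates.
    """
    # A module qualifies if it shares at least one gene with the known set.
    kept = {name: genes for name, genes in modules.items() if genes & known_pcw}

    # Pool every gene that appears in any kept module.
    pooled: Set[str] = set().union(*kept.values())

    covered = pooled & known_pcw      # annotated PCW genes found in kept modules
    candidates = pooled - known_pcw   # co-expressed genes without a PCW annotation
    return kept, covered, candidates


# Toy input with placeholder identifiers (not real Arabidopsis gene IDs).
modules = {
    "module_1": {"known_1", "cand_1", "cand_2"},
    "module_2": {"cand_3", "cand_4"},             # no known PCW gene: dropped
    "module_3": {"known_1", "known_2", "cand_2"},
}
known_pcw = {"known_1", "known_2", "known_3"}

kept, covered, candidates = select_pcw_modules(modules, known_pcw)
print(sorted(kept), sorted(covered), sorted(candidates))
```

In the study, the analogous counts are 217 modules, 349 known/annotated PCW genes, and 2,438 candidates. Genes shared between modules are counted once here because the pooled result is a set.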