Graduate School of Biotechnology and Crop Biotech Institute, Kyung Hee University, Yongin, Republic of Korea.
China Tobacco Gene Research Center, Zhengzhou Tobacco Research Institute, Zhengzhou, China.
Database (Oxford). 2019 Jan 1;2019. doi: 10.1093/database/baz061.
Transcription factors (TFs) are an important class of regulatory molecules. Despite their importance, only a small number of genes encoding TFs have been characterized in Oryza sativa (rice), often because gene duplication and functional redundancy complicate their analysis. To address this challenge, we developed a web-based tool called the Rice Transcription Factor Phylogenomics Database (RTFDB) and demonstrated its application for predicting TF function. The RTFDB hosts transcriptome and co-expression analyses. Sources include high-throughput data from oligonucleotide microarrays (Affymetrix and Agilent) as well as RNA-Seq-based expression profiles. We used the RTFDB to identify tissue-specific and stress-related gene expression. The meta-expression analysis identified 273 genes preferentially expressed in specific tissues or organs, 455 genes differentially expressed in response to 4 abiotic stresses, 179 genes responsive to infection by various pathogens and 512 genes showing differential accumulation in response to various hormone treatments. Pairwise Pearson correlation coefficient analysis between paralogous genes in a phylogenetic tree was used to assess their expression collinearity, which provides a hint of their genetic redundancy. Integrating transcriptome data with gene evolutionary information reveals possible functional redundancy or dominance among paralogous genes in a highly duplicated genome such as that of rice. With this method, we estimated a predominant role for 83.3% (65/78) of the TF or transcriptional regulator genes that had been characterized via loss-of-function studies. The proposed method is therefore applicable to functional studies of other plant species with annotated genomes.
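The pairwise Pearson correlation step described above can be illustrated with a minimal sketch. It is not the RTFDB implementation: the gene names, expression values and function names below are hypothetical, and the abstract does not state which correlation threshold, if any, is used to call paralogs redundant. The sketch only shows how a correlation coefficient is computed for every pair of paralogs within one clade.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def paralog_pcc(profiles):
    """Return {(gene_a, gene_b): PCC} for every pair of paralogs."""
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical expression values across four tissues (assumption, not RTFDB data).
profiles = {
    "paralog_A": [1.0, 2.0, 3.0, 4.0],
    "paralog_B": [2.0, 4.0, 6.0, 8.0],
    "paralog_C": [4.0, 3.0, 2.0, 1.0],
}

for pair, pcc in paralog_pcc(profiles).items():
    print(pair, round(pcc, 3))
```

In this toy data, paralog_A and paralog_B have collinear profiles, which under the approach described above would hint at possible redundancy, whereas paralog_C follows the opposite pattern. How such values are weighed against absolute expression levels to infer a predominant paralog is not specified in the abstract.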