InfoBoss Inc., Gangnam-gu, Seoul, Republic of Korea.
InfoBoss Research Center, Gangnam-gu, Seoul, Republic of Korea.
PLoS One. 2021 May 26;16(5):e0252181. doi: 10.1371/journal.pone.0252181. eCollection 2021.
GATA transcription factors (TFs) are widespread eukaryotic regulators whose DNA-binding domain is a class IV zinc finger motif (CX2CX17-20CX2C) followed by a basic region. Owing to the low cost of genome sequencing, multiple strains of individual species have been sequenced: e.g., the Plant Genome Database (http://www.plantgenome.info/) contains 2,174 plant genomes originating from 713 plant species. We therefore investigated GATA TFs genome-wide in 19 Arabidopsis thaliana genomes, using the pipeline of the GATA database (http://gata.genefamily.info/), to understand the intraspecific features of Arabidopsis GATA TFs. The numbers of GATA genes and GATA TFs in each A. thaliana genome range from 29 to 30 and from 39 to 42, respectively. Four cases in which the pattern of alternative splicing forms of GATA genes differs among the 19 A. thaliana genomes were identified. In the alignment of GATA domain amino acid sequences, 22 of 2,195 amino acids (1.002%) vary across the 19 ecotype genomes. In addition, up to four different amino acid sequences were identified per GATA domain, indicating that these position-specific amino acid variations may give rise to intraspecific functional variations. Among 15 functionally characterized GATA genes, only five display amino acid variations across ecotypes of A. thaliana, suggesting that their biological roles may vary across natural isolates of A. thaliana. Principal component analysis (PCA) of 28 characteristics of GATA genes yields four groups, identical to those defined by the number of GATA genes. The topologies of bootstrapped phylogenetic trees of Arabidopsis chloroplasts and of common GATA genes are mostly incongruent. Moreover, no relationship was found between the geographical distribution of the ecotypes and their phylogenetic relationships. Our results indicate that GATA TFs in A. thaliana are conserved across the 19 ecotypes and that their intraspecific variations are evolutionarily neutral, which is consistent with the fact that GATA TFs are among the main regulators controlling essential mechanisms such as seed germination and hypocotyl elongation.
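The class IV zinc finger motif cited in the abstract (CX2CX17-20CX2C) can be read as a pattern over one-letter amino acid codes: a cysteine, two arbitrary residues, a cysteine, 17 to 20 arbitrary residues, a cysteine, two arbitrary residues, and a final cysteine. The following is a minimal sketch of scanning a protein sequence for that pattern with a regular expression. It is not the pipeline of the GATA database; the function name `find_class_iv_zinc_fingers` is hypothetical, and both test sequences are synthetic strings made up for illustration, not real Arabidopsis proteins.

```python
import re

# Class IV zinc finger motif as written in the abstract: C-X2-C-X17-20-C-X2-C,
# where X stands for any amino acid (one-letter code, upper case).
ZINC_FINGER_IV = re.compile(r"C[A-Z]{2}C[A-Z]{17,20}C[A-Z]{2}C")


def find_class_iv_zinc_fingers(protein_seq):
    """Return (start, end, matched_text) for each non-overlapping motif hit.

    Hypothetical helper for illustration; positions are 0-based, end-exclusive.
    """
    seq = protein_seq.upper()
    return [(m.start(), m.end(), m.group()) for m in ZINC_FINGER_IV.finditer(seq)]


# Synthetic sequences (assumptions for the example, not real GATA proteins).
# The first has an 18-residue spacer between the cysteine pairs, the second
# only a 10-residue spacer, which is outside the 17-20 range.
with_motif = "MA" + "C" + "AA" + "C" + "G" * 18 + "C" + "AA" + "C" + "KK"
without_motif = "MA" + "C" + "AA" + "C" + "G" * 10 + "C" + "AA" + "C" + "KK"

hits = find_class_iv_zinc_fingers(with_motif)
no_hits = find_class_iv_zinc_fingers(without_motif)

print(hits)
print(no_hits)
```

A pattern match of this kind only flags candidate domains; it does not check the basic region that follows the zinc finger, and it says nothing about whether a candidate is a functional GATA TF.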