USDA-ARS Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA 50011, USA; Department of Computer Science, Iowa State University, Ames, IA 50011, USA.
Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306-4370, USA.
J Genet Genomics. 2014 Dec 20;41(12):627-47. doi: 10.1016/j.jgg.2014.10.004. Epub 2014 Nov 4.
The G-quadruplex (G4) elements comprise a class of nucleic acid structures formed by stacking of guanine base quartets in a quadruple helix. This G4 DNA can form within or across single-stranded DNA molecules and is mutually exclusive with duplex B-form DNA. The reversibility and structural diversity of G4s make them highly versatile genetic structures, as demonstrated by their roles in various functions including telomere metabolism, genome maintenance, immunoglobulin gene diversification, transcription, and translation. Sequence motifs capable of forming G4 DNA are typically located in telomere repeat DNA and other non-telomeric genomic loci. To investigate their potential roles in a large-genome model plant species, we computationally identified 149,988 non-telomeric G4 motifs in maize (Zea mays L., B73 AGPv2), 29% of which were in non-repetitive genomic regions. G4 motif hotspots exhibited non-random enrichment in genes at two locations on the antisense strand, one in the 5' UTR and the other at the 5' end of the first intron. Several genic G4 motifs were shown to adopt sequence-specific and potassium-dependent G4 DNA structures in vitro. The G4 motifs were prevalent in key regulatory genes associated with hypoxia (group VII ERFs), oxidative stress (DJ-1/GATase1), and energy status (AMPK/SnRK) pathways. They also showed statistical enrichment for genes in metabolic pathways that function in glycolysis, sugar degradation, inositol metabolism, and base excision repair. Collectively, the maize G4 motifs may represent conditional regulatory elements that can aid in energy status gene responses. Such a network of elements could provide a mechanistic basis for linking energy status signals to gene regulation in maize, a model genetic system and major world crop species for feed, food, and fuel.
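The computational identification of G4 motifs described above can be illustrated with a minimal sketch. The abstract does not state the exact motif definition used, so the pattern below is an assumption: the widely used form of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The names `find_g4_motifs`, `G_PATTERN`, and `C_PATTERN` are hypothetical and are not taken from the paper. Motifs on the opposite strand are detected by scanning the same sequence for the complementary C-rich pattern.

```python
import re

# Assumed motif definition (not stated in the abstract):
# four G-runs of >= 3 nt, separated by three loops of 1-7 nt.
LOOP = "[ACGT]{1,7}"
G_PATTERN = re.compile(f"G{{3,}}(?:{LOOP}G{{3,}}){{3}}")
# A G4 motif on the minus strand appears as a C-rich pattern on the plus strand.
C_PATTERN = re.compile(f"C{{3,}}(?:{LOOP}C{{3,}}){{3}}")


def find_g4_motifs(seq):
    """Return sorted (start, end, strand, matched_text) tuples.

    Coordinates are 0-based, end-exclusive, on the given sequence.
    Only non-overlapping matches per strand are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", G_PATTERN), ("-", C_PATTERN)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand, match.group()))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTGGGAGGGTGGGAGGGTT"
    for hit in find_g4_motifs(example):
        print(hit)
```

A genome-scale scan would additionally need to read FASTA records, mask or exclude telomeric repeats, and intersect hits with gene annotations to ask whether motifs fall in 5' UTRs or first introns. Those steps are omitted here.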