Crop Improvement and Genetics Research Unit, Western Regional Research Center, U.S. Department of Agriculture - Agricultural Research Service, 800 Buchanan St, Albany, CA, 94710, USA.
Montana BioAg Inc., Missoula, MT, USA.
Sci Rep. 2021 Apr 12;11(1):7876. doi: 10.1038/s41598-021-86838-3.
G-quadruplexes (G4s) are four-stranded nucleic acid structures in which closely spaced guanine bases form square planar G-quartets. Aberrant formation of G4 structures has been associated with genomic instability. However, comprehensive studies of G4 motifs are lacking for most plant species. In this study, we performed genome-wide identification of G4 motifs in barley and compared their genomic distribution and associated molecular functions with those in other monocot species, such as wheat, maize, and rice. Consistent with reports in human and in some plants such as wheat, G4 motifs peaked around the 5' untranslated region (5' UTR), the first coding domain sequence, and the first intron start sites on antisense strands. Our comparative analyses in human, Arabidopsis, maize, rice, and sorghum demonstrated that these peaks can be erroneously merged into a single peak when large window sizes are used. We also showed that G4 distributions around genic regions are relatively similar across the species studied, with the exception of Arabidopsis. G4-containing genes in monocots showed conserved molecular functions related to transcription initiation and hydrolase activity. Additionally, we provided examples of imperfect G4 motifs.
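The abstract does not state which search pattern or tool was used for motif identification. As a minimal sketch only, the following Python example assumes the widely used canonical pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides, and scans for the complementary C-rich pattern to report motifs on the antisense strand. The function name `find_g4_motifs`, the loop-length limits, and the 0-based half-open coordinates are illustrative assumptions, not the authors' method.

```python
import re

# Assumption: canonical G4 motif, G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}.
# The loop limits (1-7 nt) and run length (>= 3) are illustrative defaults.
LOOP = r"[ACGTN]{1,7}"
G_PATTERN = re.compile(rf"G{{3,}}{LOOP}G{{3,}}{LOOP}G{{3,}}{LOOP}G{{3,}}")
# A G4 motif on the antisense strand appears as a C-rich motif on the sense strand.
C_PATTERN = re.compile(rf"C{{3,}}{LOOP}C{{3,}}{LOOP}C{{3,}}{LOOP}C{{3,}}")


def find_g4_motifs(sequence):
    """Return sorted (start, end, strand) tuples for non-overlapping motif hits.

    Coordinates are 0-based and half-open, relative to the given sequence.
    Strand "+" means the G-rich motif is on the given strand; "-" means it is
    on the complementary strand.
    """
    seq = sequence.upper()
    hits = [(m.start(), m.end(), "+") for m in G_PATTERN.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in C_PATTERN.finditer(seq)]
    return sorted(hits)


if __name__ == "__main__":
    example = "ATGGGTGGGTGGGTGGGATCCCACCCACCCACCCAT"
    for start, end, strand in find_g4_motifs(example):
        print(start, end, strand, example[start:end])
```

Because `finditer` reports non-overlapping matches, overlapping candidate motifs are collapsed into the first match found. Imperfect G4 motifs, such as those mentioned at the end of the abstract, would not be detected by this strict pattern.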