Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics and COE in Biomathematics, University of Tehran, Iran.
BMC Genomics. 2011 May 6;12:214. doi: 10.1186/1471-2164-12-214.
Due to its overarching role in genome function, sequence-dependent DNA curvature continues to attract great attention. The DNA double helix is not a rigid cylinder; its curvature and flexibility vary from region to region, depending on the sequence. More in-depth knowledge of the various orders of complexity of genomic DNA structure has allowed the design of sophisticated bioinformatics tools for its analysis and manipulation, which, in turn, have yielded a better understanding of the genome itself. Curved DNA is involved in many biologically important processes, such as transcription initiation and termination, recombination, DNA replication, and nucleosome positioning. CpG islands and tandem repeats also play significant roles in the dynamics and evolution of genomes.
In this study, we analyzed the relationship between these three structural features (DNA curvature, CpG islands and tandem repeats) within the rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana) genomes. A genome-scale prediction of curvature distribution in rice and Arabidopsis indicated that, in most chromosomes of both genomes, chromosomal DNA curvature is maximal adjacent to the centromeric region. By analyzing tandem repeats across the genome, we found that repeat frequencies are higher in regions adjacent to those with high curvature values. Further analysis of CpG islands showed a clear interdependence between curvature value, repeat frequency and CpG islands. Each CpG island appears in a region of locally minimal curvature, and CpG islands usually do not appear in the centromere or in regions with high repeat frequency. A statistical evaluation demonstrated the significance and non-randomness of these features.
This study represents the first systematic genome-scale analysis of DNA curvature, CpG islands and tandem repeats at the DNA sequence level in plant genomes. It finds that not all plant chromosomes follow the rules common to other eukaryotic organisms, suggesting that some of these genomic properties might be specific to plants.
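The abstract does not say how CpG islands were identified, so the following is only a minimal sketch of one widely used convention, not the authors' method. It assumes a sliding window of 200 bp, a GC fraction of at least 0.5 and an observed/expected CpG ratio of at least 0.6. These thresholds and the function names `cpg_stats` and `candidate_windows` are illustrative assumptions.

```python
from typing import Iterator, Tuple


def cpg_stats(window: str) -> Tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    seq = window.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently in the window.
    expected = c * g / n if n else 0.0
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio


def candidate_windows(
    seq: str,
    size: int = 200,         # assumed window length in bp
    step: int = 1,           # assumed slide step in bp
    min_gc: float = 0.5,     # assumed GC-fraction threshold
    min_ratio: float = 0.6,  # assumed observed/expected threshold
) -> Iterator[int]:
    """Yield start offsets of windows that meet both thresholds."""
    for start in range(0, len(seq) - size + 1, step):
        gc, ratio = cpg_stats(seq[start:start + size])
        if gc >= min_gc and ratio >= min_ratio:
            yield start
```

Overlapping candidate windows would normally be merged into islands in a further step, and their positions could then be compared against per-window curvature and repeat-frequency profiles. That merging and comparison are not shown here.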