Department of Biotechnology, Indian Institute of Technology, Kharagpur, 721302, India.
Present address: Gregor Mendel Institute, Dr. Bohr-gasse 3, Vienna, 1030, Austria.
BMC Genomics. 2018 Feb 21;19(1):156. doi: 10.1186/s12864-018-4494-3.
The repetitive content of the genome, once considered to be "junk DNA", is in fact an essential component of genomic architecture and evolution. In this study, we analyzed the repetitive content of the genomes of three varieties of Cannabis sativa, three varieties of Humulus lupulus, and one genotype of Morus notabilis using a graph-based clustering method designed to characterize and compare repeat content in genomes that have not been fully assembled.
The repetitive content in the C. sativa genome is mainly composed of the retrotransposons LTR/Copia and LTR/Gypsy (14% and 14.8%, respectively), ribosomal DNA (2%), and low-complexity sequences (29%). We observed a recent copy number expansion in some transposable element families. Simple repeats and low-complexity regions of the genome show higher intra- and inter-species variation.
As with other sequenced genomes, the repetitive content of the C. sativa genome exhibits a wide range of evolutionary patterns. Some repeat types have patterns of diversity consistent with expansions followed by losses in copy number, while others may have expanded more slowly and reached a steady state. Still other repetitive sequences, particularly ribosomal DNA (rDNA), show signs of concerted evolution playing a major role in homogenizing sequence variation.
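To illustrate the idea behind graph-based clustering of unassembled reads, the following is a minimal sketch and not the pipeline used in the study. It assumes that pairwise read similarities have already been computed (the `similar_pairs` input and the function name `cluster_reads` are hypothetical), treats reads as graph nodes, and uses connected components as a simple stand-in for the community detection a real tool would apply. The fraction of reads falling into each cluster then serves as a rough estimate of the genome proportion of that repeat.

```python
from collections import Counter


def cluster_reads(n_reads, similar_pairs):
    """Cluster reads as connected components of a similarity graph.

    n_reads: total number of sampled reads (nodes 0..n_reads-1).
    similar_pairs: iterable of (i, j) read index pairs judged similar
        (hypothetical input; the similarity search is not shown here).
    Returns the fraction of reads in each multi-read cluster, largest first.
    """
    parent = list(range(n_reads))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    sizes = Counter(find(i) for i in range(n_reads))
    # Singleton reads are treated as non-repetitive and skipped.
    return sorted((s / n_reads for s in sizes.values() if s > 1), reverse=True)


# Toy example: 10 reads, two groups of mutually similar reads.
pairs = [(0, 1), (1, 2), (2, 3), (5, 6)]
fractions = cluster_reads(10, pairs)
print(fractions)
```

In this toy input, four reads form one cluster and two form another, so the estimated proportions are 40% and 20% of the sampled reads. Real data would need low-coverage random sampling so that read counts scale with repeat abundance.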