State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China.
Gembloux Agro-Bio Tech, University of Liège, 5030, Gembloux, Belgium.
BMC Plant Biol. 2018 Nov 28;18(1):307. doi: 10.1186/s12870-018-1519-7.
Fluorescence in situ hybridization (FISH) is an efficient cytogenetic technique for studying chromosome structure. Transposable elements (TEs) are important components of eukaryotic genomes and can provide insights into the structure and evolution of those genomes.
A FISH probe derived from bacterial artificial chromosome (BAC) clone 299N22 generated striking signals on all 26 chromosomes of the cotton diploid A genome (AA, 2x=26) but very few on the diploid D genome (DD, 2x=26). All 26 chromosomes of the A subgenome (At) of tetraploid cotton (AADD, 2n=4x=52) also gave positive signals with this FISH probe, whereas very few signals were observed on the D subgenome (Dt). Sequencing and annotation of BAC clone 299N22 revealed a novel Ty3/gypsy transposon family, which was named 'CICR'. This family is a significant contributor to size expansion in the A (sub)genome but not in the D (sub)genome. Further FISH analysis with the LTR of CICR as a probe revealed that CICR is lineage-specific: massive repeats were found in the A and B genomic groups, but not in the C-G genomic groups of the genus Gossypium. Molecular evolutionary analysis of CICR suggested that tetraploid cottons evolved after this transposon family was silenced 1-1.5 million years ago (Mya). Furthermore, A genomes are more homologous to B genomes, and the C, E, F, and G genomes likely diverged from a common ancestor before 3.5-4 Mya, the time when CICR appeared. The genomic variation caused by the insertion of CICR in the A (sub)genome may have played an important role in the speciation of organisms with A genomes.
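The abstract does not state how the activity window of CICR was dated. A common approach for LTR retrotransposons, shown here only as a minimal sketch, exploits the fact that the two LTRs of an element are identical at insertion and diverge afterwards, so the insertion time can be approximated as T = K / (2r), where K is the divergence between the two LTRs and r is the substitution rate per site per year. The function names, the Jukes-Cantor correction, and the rate used in the example are assumptions for illustration, not values or methods taken from this study.

```python
import math


def jukes_cantor_distance(ltr5: str, ltr3: str) -> float:
    """Jukes-Cantor corrected divergence K between two pre-aligned LTRs.

    Both sequences must already be aligned to the same length; columns
    containing gaps or ambiguous bases are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites in the alignment")
    p = sum(a != b for a, b in pairs) / len(pairs)  # raw mismatch fraction
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_time_mya(k: float, rate_per_site_per_year: float) -> float:
    """Approximate insertion time in million years: T = K / (2r)."""
    return k / (2.0 * rate_per_site_per_year) / 1e6


# Hypothetical example: the rate is an illustrative placeholder only.
ASSUMED_RATE = 1.3e-8
example_time = insertion_time_mya(0.026, ASSUMED_RATE)
```

With real data, the 5' and 3' LTRs of each intact element would be aligned first, and the distribution of the resulting times across elements would indicate when the family was active.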
The CICR family is highly repetitive in the A and B genomes of Gossypium but is not amplified in the C-G genomes. The differing abundance of the CICR family in At and Dt will aid in partitioning subgenome sequences for chromosome assemblies during tetraploid genome sequencing, and the proportion of CICR elements in the resulting pseudochromosome sequences can serve as a check on the accuracy of tetraploid genome assemblies. The timeline of the expansion of the CICR family provides a new reference for cotton evolutionary analysis, while the impact of CICR insertions on gene function will be a target for further analysis of phenotypic differences between A genome and D genome species.
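The assembly check described above can be sketched as follows: given CICR annotations on each pseudochromosome, compute the fraction of bases they cover, expecting high values on At chromosomes and low values on Dt chromosomes. This is a minimal illustration under stated assumptions; the chromosome names, lengths, and coordinates are hypothetical, and the annotation format (chromosome, start, end with half-open coordinates) is an assumption rather than anything specified in the study.

```python
from collections import defaultdict


def merged_length(intervals):
    """Total bases covered by half-open (start, end) intervals.

    Overlapping or adjacent intervals are counted only once.
    """
    total = 0
    prev_end = None
    for start, end in sorted(intervals):
        if prev_end is None or start > prev_end:
            total += end - start      # new, disjoint block
            prev_end = end
        elif end > prev_end:
            total += end - prev_end   # extend the current block
            prev_end = end
    return total


def cicr_fraction(chrom_lengths, annotations):
    """Fraction of each pseudochromosome covered by CICR annotations.

    chrom_lengths: dict mapping chromosome name to length in bp.
    annotations: iterable of (chromosome, start, end) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in annotations:
        by_chrom[chrom].append((start, end))
    return {
        chrom: merged_length(by_chrom[chrom]) / length
        for chrom, length in chrom_lengths.items()
    }


# Hypothetical toy data, not taken from any real assembly.
toy_lengths = {"A01": 1000, "D01": 1000}
toy_annotations = [
    ("A01", 0, 100),
    ("A01", 50, 150),
    ("A01", 200, 300),
    ("D01", 10, 20),
]
toy_fractions = cicr_fraction(toy_lengths, toy_annotations)
```

A pseudochromosome assigned to Dt that shows an At-like CICR fraction, or the reverse, would be a candidate for misassigned or misjoined sequence and worth inspecting further.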