Lin Haining, Zhu Wei, Silva Joana C, Gu Xun, Buell C Robin
The Institute for Genomic Research, Medical Center Drive, Rockville, MD 20850, USA.
Genome Biol. 2006;7(5):R41. doi: 10.1186/gb-2006-7-5-r41. Epub 2006 May 23.
Introns are under less selection pressure than exons, and consequently, intronic sequences have a higher rate of gain and loss than exons. In a number of plant species, a large portion of the genome has been segmentally duplicated, giving rise to a large set of duplicated genes. The recent completion of the rice genome, in which segmental duplication has been documented, has allowed us to investigate intron evolution within rice, a diploid monocotyledonous species.
Analysis of segmental duplication in rice revealed that 159 Mb of the 371 Mb genome and 21,570 of the 43,719 non-transposable element-related genes were contained within a duplicated region. These duplicated regions contained 3,101 collinear gene pairs. Using this set of segmentally duplicated genes, we investigated intron evolution in full-length cDNA-supported, non-transposable element-related gene models of rice. Among gene pairs that have an ortholog in the dicotyledonous model species Arabidopsis thaliana, we identified more intron loss (49 introns within 35 gene pairs) than intron gain (5 introns within 5 gene pairs) following segmental duplication. We were unable to demonstrate preferential intron loss at the 3' end of genes, as previously reported in mammalian genomes. However, we did find that the four nucleotides of exons that flank lost introns had less frequently used 4-mers.
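The 4-mer observation above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`kmer_counts`, `flank_kmers`, `relative_frequency`), the toy sequences, and the choice to take exactly four bases on each side of the intron position are assumptions made for illustration only. The idea is to build a background table of 4-mer usage across exon sequences, then look up how often the 4-mers flanking an intron position occur in that background.

```python
from collections import Counter


def kmer_counts(sequences, k=4):
    """Count every overlapping k-mer across a collection of exon sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def flank_kmers(upstream_exon, downstream_exon, k=4):
    """Return the last k bases of the upstream exon and the first k bases
    of the downstream exon (assumed definition of the flanking 4-mers)."""
    return upstream_exon[-k:].upper(), downstream_exon[:k].upper()


def relative_frequency(kmer, counts):
    """Fraction of all counted k-mers that are equal to `kmer`."""
    total = sum(counts.values())
    return counts[kmer] / total if total else 0.0


# Toy exon sequences (hypothetical, not rice data).
exons = ["ATGGCGTACG", "GGCGTTAGCA"]
background = kmer_counts(exons)

# 4-mers on either side of the intron position between the two toy exons.
left, right = flank_kmers(exons[0], exons[1])
for kmer in (left, right):
    print(kmer, round(relative_frequency(kmer, background), 3))
```

With real data, the flanking 4-mers of lost introns would be compared against those of retained introns; the statistical test used for that comparison is not stated in the abstract and is left out here.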
We observed that intron evolution within rice following segmental duplication is largely dominated by intron loss. In two of the five cases of intron gain within segmentally duplicated genes, the gained sequences were similar to transposable elements.
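To put the reported counts in proportion, the short calculation below derives the shares implied by the figures in the abstract. The variable names are illustrative; the input numbers are taken directly from the text, and nothing beyond simple ratios is computed.

```python
# Figures as reported in the abstract.
duplicated_mb, genome_mb = 159, 371
duplicated_genes, total_genes = 21_570, 43_719
introns_lost, introns_gained = 49, 5
te_like_gains = 2

# Share of the genome and of the gene set inside duplicated regions.
genome_share = 100 * duplicated_mb / genome_mb
gene_share = 100 * duplicated_genes / total_genes

# Share of intron changes that are losses, and loss-to-gain ratio.
loss_share = 100 * introns_lost / (introns_lost + introns_gained)
loss_to_gain = introns_lost / introns_gained

# Share of gained introns resembling transposable elements.
te_share = 100 * te_like_gains / introns_gained

print(f"Genome in duplicated regions: {genome_share:.1f}%")   # 42.9%
print(f"Genes in duplicated regions:  {gene_share:.1f}%")     # 49.3%
print(f"Intron changes that are losses: {loss_share:.1f}%")   # 90.7%
print(f"Loss-to-gain ratio: {loss_to_gain:.1f}")              # 9.8
print(f"Gains similar to transposable elements: {te_share:.0f}%")  # 40%
```

Roughly nine in ten of the observed intron changes are losses, which is the sense in which intron evolution after segmental duplication is dominated by loss.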