Zhao Yi, Tang Liang, Li Zhe, Jin Jinpu, Luo Jingchu, Gao Ge
State Key Laboratory of Protein and Plant Gene Research, College of Life Science, Center for Bioinformatics, Peking University, Beijing, 100871, People's Republic of China.
Current address: College of Horticulture and Landscape Architecture, Southwest University, Chongqing, 400715, People's Republic of China.
BMC Evol Biol. 2015 Apr 18;15:66. doi: 10.1186/s12862-015-0345-x.
Long-established protein-coding genes may lose their coding potential during evolution ("unitary gene loss"). Members of the Poaceae family are a major food source and represent an ideal model clade for plant evolution research. However, the global pattern of unitary gene loss in Poaceae genomes and the evolutionary fate of the lost genes remain under-investigated and largely elusive.
Using a locally developed pipeline, we identified 129 unitary gene loss events for long-established protein-coding genes in four representative Poaceae species: brachypodium, rice, sorghum and maize. Functional annotation suggested that the genes lost in all or most Poaceae species are enriched for genes involved in development and response to endogenous stimulus. We also found that 44 mutated genomic loci of lost genes, which we refer to as relics, were still actively transcribed, and that 84% of them (37 of 44) showed significantly differential expression across tissues. More interestingly, we found that a total of five expressed relics may function as competitive endogenous RNAs in the brachypodium, rice and sorghum genomes.
Based on comparative genomics and transcriptome data, we compiled, for the first time, a comprehensive catalogue of unitary gene loss events in Poaceae species, characterized a statistically significant functional preference among these lost genes, and showed the potential of relics to function as competitive endogenous RNAs in Poaceae genomes.