Rody Hugo V S, Baute Gregory J, Rieseberg Loren H, Oliveira Luiz O
Department of Biochemistry and Molecular Biology, Universidade Federal de Viçosa, Viçosa, 36570-900, Minas Gerais, Brazil.
Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
BMC Genomics. 2017 Jan 6;18(1):46. doi: 10.1186/s12864-016-3423-6.
All extant seed plants are successful paleopolyploids, whose genomes carry duplicate genes that have survived repeated episodes of diploidization. However, the survival of gene duplicates is biased with respect to gene function and mechanism of duplication. Transcription factors, in particular, are reported to be preferentially retained following whole-genome duplications (WGDs), but disproportionately lost when duplicated by tandem events. An explanation for this pattern is provided by the Gene Balance Hypothesis (GBH), which posits that duplicates of highly connected genes are retained following WGDs to maintain optimal stoichiometry among gene products, whereas duplicates of such connected genes are disfavored following tandem duplications.
We used genomic data from 25 taxonomically diverse plant species to investigate the roles of duplication mechanism, gene function, and age of duplication in the retention of duplicate genes. Enrichment analyses were conducted to identify Gene Ontology (GO) functional categories that were overrepresented in either WGD or tandem duplications, or across ranges of divergence times. Tandem paralogs were much younger, on average, than WGD paralogs, and the most frequently overrepresented GO categories were not shared between tandem and WGD paralogs. Transcription factors were overrepresented among ancient paralogs regardless of mechanism of origin or presence of a WGD. Moreover, in many cases, there was no bias toward transcription factor retention following recent WGDs.
Both the fixation and the retention of duplicated genes in plant genomes are context-dependent events. The strong bias toward ancient transcription factor duplicates can be reconciled with the GBH if selection for optimal stoichiometry among gene products is strongest following the earliest polyploidization events and becomes increasingly relaxed as gene families expand.
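The abstract does not say which statistical test was used for the GO enrichment analyses. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for overrepresentation analysis, not confirmed as the authors' method), the probability of seeing at least k genes from a GO category among n duplicate genes can be computed as follows. The function name and the toy counts are hypothetical and are not data from the study.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric tail probability P(X >= k).

    N: total number of annotated genes in the genome (background)
    K: background genes annotated with the GO category of interest
    n: genes in the duplicate set being tested (e.g. WGD paralogs)
    k: genes in the duplicate set annotated with the GO category

    Assumption: this is an illustrative test, not the published pipeline.
    """
    upper = min(n, K)  # the overlap cannot exceed either set size
    total = comb(N, n)  # number of ways to draw n genes from N
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical toy counts: 10 genes, 4 in the category, 3 duplicates, 2 overlap.
p = enrichment_p_value(k=2, n=3, K=4, N=10)
print(f"P(X >= 2) = {p:.4f}")
```

A small p-value would suggest that the category is overrepresented in the duplicate set. In a real analysis, one test is run per GO category, so the resulting p-values would also need a multiple-testing correction.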