Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Sakyo-ku, Kyoto, Kyoto, Japan.
Faculty of Agriculture, Setsunan University, Hirakata, Osaka, Japan.
Mol Biol Evol. 2021 Jun 25;38(7):2791-2803. doi: 10.1093/molbev/msab069.
How newborn coding sequences and their transcriptional competency emerge during gene evolution remains unclear. Here, we experimentally simulated eukaryotic gene origination processes by mimicking horizontal gene transfer events in the plant genome. We mapped the precise positions of the transcription start sites (TSSs) of hundreds of newly introduced promoterless firefly luciferase (LUC) coding sequences in the genome of Arabidopsis thaliana cultured cells. Systematic characterization of the LUC-TSSs revealed that 80% of them occurred under the influence of endogenous promoters, while the remainder underwent de novo activation in intergenic regions, starting from pyrimidine-purine dinucleotides. These de novo TSSs obeyed unexpected rules: they predominantly occurred ∼100 bp upstream of the LUC inserts and did not overlap with Kozak-containing putative open reading frames (ORFs). These features resulted from the immediate response to the sequence insertions rather than from a bias in screening for LUC gene function. Wild-type genic TSSs, for their part, appeared to have evolved to lack any ORFs in their vicinity. Therefore, the avoidance of Kozak-containing ORFs by the de novo TSSs described above might be the first selection gate for the occurrence and evolution of TSSs in the plant genome. Based on these results, we characterize the de novo type of TSS identified in the plant genome and discuss its significance in genome evolution.
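As a purely illustrative sketch, the two positional features reported for de novo TSSs (initiation at a pyrimidine-purine dinucleotide, and a position roughly 100 bp upstream of the LUC insert) could be checked on a sequence as follows. The function names, the 0-based coordinates, and the convention that the pyrimidine sits at position -1 and the purine at the TSS itself are assumptions made for this example; this is not the authors' analysis pipeline.

```python
# Hypothetical helpers; names and coordinate conventions are assumptions.
PYRIMIDINES = {"C", "T"}
PURINES = {"A", "G"}


def is_pyrimidine_purine_start(sequence: str, tss_index: int) -> bool:
    """Return True if the base just before the TSS is a pyrimidine and the
    base at the TSS is a purine (0-based index into the sense strand)."""
    if tss_index < 1 or tss_index >= len(sequence):
        return False  # no upstream base, or index outside the sequence
    seq = sequence.upper()
    return seq[tss_index - 1] in PYRIMIDINES and seq[tss_index] in PURINES


def distance_upstream(tss_index: int, insert_start: int) -> int:
    """Distance in bp from the TSS to the first base of the insert.
    Positive values mean the TSS lies upstream of the insert."""
    return insert_start - tss_index


if __name__ == "__main__":
    example = "GGCATT"  # toy sequence, not real data
    print(is_pyrimidine_purine_start(example, 3))  # C at -1, A at the TSS
    print(distance_upstream(400, 500))
```

The check treats any base other than A, C, G, or T (for example N) as neither purine nor pyrimidine, so ambiguous positions are reported as not matching.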