School of Biology, Georgia Institute of Technology, GA, USA.
Genome Biol Evol. 2011;3:259-71. doi: 10.1093/gbe/evr015. Epub 2011 Feb 28.
Independent lines of investigation have documented effects of both transposable elements (TEs) and gene length (GL) on gene expression. However, TE gene fractions are highly correlated with GL, suggesting that the two cannot be considered independently. We evaluated the TE environment of human genes and GL jointly in an attempt to tease apart their relative effects. TE gene fractions and GL were compared with the overall level of gene expression and the breadth of expression across tissues. GL is strongly negatively correlated with overall expression level but only weakly correlated with the breadth of expression, confirming the selection hypothesis that attributes the compactness of highly expressed genes to selection for economy of transcription. However, TE gene fractions overall, and for the L1 family in particular, show stronger anticorrelations with expression level than GL does, indicating that GL may not be the most important target of selection for transcriptional economy. These results suggest a specific mechanism, removal of TEs, by which highly expressed genes are selectively tuned for efficiency. MIR elements are the only family of TEs with gene fractions that show a positive correlation with tissue-specific expression, suggesting that they may provide regulatory sequences that help to control human gene expression. Consistent with this notion, MIR fractions are relatively enriched close to transcription start sites and associated with coexpression in specific sets of related tissues. Our results confirm the overall relevance of the TE environment to gene expression and point to distinct mechanisms by which different TE families may contribute to gene regulation.
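As an illustration of how two correlated predictors such as TE gene fraction and GL might be separated, the sketch below computes rank correlations and first-order partial correlations on simulated data. It is not the authors' analysis: the partial-correlation approach is swapped in as one common technique, and the function names (`rank`, `spearman`, `partial_spearman`, `simulate`), the simulated values, and the assumption that expression depends on TE fraction alone are all hypothetical.

```python
import numpy as np


def rank(values):
    """Return 0-based ranks; ties are not averaged (adequate for continuous data)."""
    values = np.asarray(values)
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])


def partial_spearman(x, y, z):
    """Rank correlation of x and y after controlling for z (first-order partial)."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return float((r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2)))


def simulate(n=5000, seed=0):
    """Hypothetical genes: expression depends on TE content only (an assumption)."""
    rng = np.random.default_rng(seed)
    log_length = rng.normal(10.0, 1.0, n)                  # stand-in for log gene length
    te_content = 0.8 * log_length + rng.normal(0, 0.6, n)  # TE content tracks length
    expression = -1.0 * te_content + rng.normal(0, 1.0, n)
    return log_length, te_content, expression


if __name__ == "__main__":
    gl, te, expr = simulate()
    print("GL vs expression:        ", round(spearman(gl, expr), 3))
    print("TE vs expression:        ", round(spearman(te, expr), 3))
    print("GL vs expression given TE:", round(partial_spearman(gl, expr, te), 3))
    print("TE vs expression given GL:", round(partial_spearman(te, expr, gl), 3))
```

In this simulated setting both predictors are anticorrelated with expression, but only the TE term keeps a clear negative association once the other predictor is controlled for. With real data the outcome depends entirely on the measurements, so the sketch shows the form of the comparison rather than any result from the study.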