National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
Genome Biol Evol. 2009 Sep 22;1:382-90. doi: 10.1093/gbe/evp038.
Analysis of gene architecture and expression levels in four organisms, Homo sapiens, Caenorhabditis elegans, Drosophila melanogaster, and Arabidopsis thaliana, reveals a surprising, nonmonotonic, universal relationship between expression level and gene compactness. With increasing expression level, genes tend at first to become longer but, beyond a certain level of expression, they become more and more compact, resulting in an approximately bell-shaped dependence. There are two leading hypotheses to explain the compactness of highly expressed genes. The selection hypothesis predicts that gene compactness is predominantly driven by the level of expression, whereas the genomic design hypothesis predicts that expression breadth across tissues is the driving force. We observed the connection between gene expression breadth in humans and gene compactness to be significantly weaker than the connection between expression level and compactness, a result that is compatible with the selection hypothesis but not with the genomic design hypothesis. The initial gene elongation with increasing expression level could be explained, at least in part, by the accumulation of expression-enhancing regulatory elements, particularly in introns. This explanation is compatible with the observed positive correlation between intron density and expression level of a gene. Conversely, the trend toward increasing compactness for highly expressed genes could be caused by selection for minimization of energy and time expenditure during transcription and splicing and for increased fidelity of transcription, splicing, and/or translation, which is likely to be particularly critical for highly expressed genes. Regardless of the exact nature of the forces that shape gene architecture, we present evidence that, at least in animals, coding and noncoding parts of genes show similar architectonic trends.
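As an illustration of what a nonmonotonic, approximately bell-shaped dependence means in practice, the sketch below bins genes by expression level and checks whether the median gene length per bin rises to a single interior peak and then falls. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function names, the equal-count binning, and the shape check are hypothetical choices, and the toy values are invented for demonstration and are not data from the paper.

```python
from statistics import median


def median_length_by_expression_bin(genes, n_bins):
    """Return the median gene length in each expression bin.

    genes:  list of (expression_level, gene_length) pairs (hypothetical input format).
    n_bins: number of near-equal-count bins, ordered from lowest to highest expression.
    """
    if n_bins < 1 or n_bins > len(genes):
        raise ValueError("n_bins must be between 1 and the number of genes")
    ranked = sorted(genes, key=lambda gene: gene[0])  # sort by expression level
    size, extra = divmod(len(ranked), n_bins)
    medians = []
    start = 0
    for i in range(n_bins):
        # The first `extra` bins take one additional gene so every gene is used.
        end = start + size + (1 if i < extra else 0)
        medians.append(median(length for _, length in ranked[start:end]))
        start = end
    return medians


def is_bell_shaped(values):
    """True if values rise (non-strictly) to one interior peak and then fall."""
    peak = values.index(max(values))
    if peak == 0 or peak == len(values) - 1:
        return False  # a peak at either end means the trend is monotonic
    rising = all(a <= b for a, b in zip(values[:peak], values[1:peak + 1]))
    falling = all(a >= b for a, b in zip(values[peak:], values[peak + 1:]))
    return rising and falling


# Invented toy values (expression level, gene length), arbitrary units.
toy_genes = [(1, 10), (2, 12), (3, 30), (4, 34), (5, 50),
             (6, 46), (7, 20), (8, 22), (9, 8), (10, 6)]

toy_medians = median_length_by_expression_bin(toy_genes, 5)
print(toy_medians, is_bell_shaped(toy_medians))
```

With real data, the same binning could be repeated with expression breadth in place of expression level to compare how strongly each variable tracks gene length, which is the comparison the two hypotheses turn on.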