
Introns Structure Patterns of Variation in Nucleotide Composition in Arabidopsis thaliana and Rice Protein-Coding Genes.

Authors

Ressayre Adrienne, Glémin Sylvain, Montalent Pierre, Serre-Giardi Laurana, Dillmann Christine, Joets Johann

Affiliations

UMR 0320/UMR 8120 Génétique Quantitative et Evolution-Le Moulon, INRA, Gif-sur-Yvette, France

Institut des Sciences de l'Evolution (ISEM), UMR 5554, Université de Montpellier, CNRS-IRD-EPHE, France

Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Sweden

Publication information

Genome Biol Evol. 2015 Oct 7;7(10):2913-28. doi: 10.1093/gbe/evv189.

Abstract

Plant genomes present a continuous range of variation in nucleotide composition (G + C content). In coding regions, G + C-poor species tend to have unimodal distributions of G + C content among genes within genomes and slight 5'-3' gradients along genes. In contrast, G + C-rich species display bimodal distributions of G + C content among genes and steep 5'-3' decreasing gradients along genes. The causes of these peculiar patterns are still poorly understood. Within two species (Arabidopsis thaliana and rice), each representative of one side of the continuum, we studied the consequences of intron presence on coding region and intron G + C content at different scales. By properly taking intron structure into account, we showed that, in both species, intron presence is associated with step changes in nucleotide, codon, and amino acid composition. This suggests that introns have a barrier effect structuring G + C content along genes and that previous continuous characterizations of the 5'-3' gradients were artifactual. In external gene regions (located upstream first or downstream last introns), species-specific factors, such as GC-biased gene conversion, are shaping G + C content whereas in internal gene regions (surrounded by introns), G + C content is likely constrained to remain within a range common to both species.
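As an illustration only, the idea of measuring G + C content per coding segment delimited by introns (rather than as a continuous function of position) can be sketched as below. This is a minimal sketch and not the authors' pipeline: the function names `gc_content` and `gc_by_exon`, the 0-based half-open exon coordinates, and the toy sequence are all assumptions made for the example.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    # Return NaN when the segment holds no unambiguous base.
    return gc / total if total else float("nan")


def gc_by_exon(gene_seq: str, exons: List[Tuple[int, int]]) -> List[float]:
    """G + C content of each exon, in 5'-3' order.

    `exons` holds hypothetical 0-based, half-open (start, end) coordinates
    on `gene_seq`; everything between consecutive exons is treated as intron.
    """
    return [gc_content(gene_seq[start:end]) for start, end in sorted(exons)]


# Toy gene: a G + C-rich first exon, an A + T-only intron, a second exon.
gene = "GCGCGCGCGC" + "ATATATATAT" + "ATGCATGCAT"
exons = [(0, 10), (20, 30)]
per_exon = gc_by_exon(gene, exons)
print(per_exon)
```

Comparing such per-segment values across genes grouped by intron position is one way to see a step change at an intron boundary that a position-only average would smooth into an apparent gradient.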

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30d3/4684703/451582328846/evv189f1p.jpg
