Lemos Bernardo, Bettencourt Brian R, Meiklejohn Colin D, Hartl Daniel L
Department of Organismic and Evolutionary Biology, Harvard University, USA.
Mol Biol Evol. 2005 May;22(5):1345-54. doi: 10.1093/molbev/msi122. Epub 2005 Mar 2.
Organismic evolution requires that variation at distinct hierarchical levels and attributes be coherently integrated, often in the face of disparate environmental and genetic pressures. A central part of the evolutionary analysis of biological systems remains to decipher the causal connections between organism-wide (or genome-wide) attributes (e.g., mRNA abundance, protein length, codon bias, recombination rate, genomic position, and mutation rate) as well as their role, together with mutation, selection, and genetic drift, in shaping patterns of evolutionary variation in any of the attributes themselves. Here we combine genome-wide evolutionary analysis of protein and gene expression data to highlight fundamental relationships among genomic attributes and their associations with the evolution of both protein sequences and gene expression levels. Our results show that protein divergence is positively coupled with both gene expression polymorphism and divergence. We show moreover that although the number of protein-protein interactions in Drosophila is negatively associated with protein divergence as well as gene expression polymorphism and divergence, protein-protein interactions cannot account for the observed coupling between regulatory and structural evolution. Furthermore, we show that proteins with higher rates of amino acid substitutions tend to have larger sizes and tend to be expressed at lower mRNA abundances, whereas genes with higher levels of gene expression divergence and polymorphism tend to have shorter sizes and tend to be expressed at higher mRNA abundances. Finally, we show that protein length is negatively associated with both number of protein-protein interactions and mRNA abundance and that interacting proteins in Drosophila show similar amounts of divergence.
We suggest that protein sequences and gene expression are subjected to similar evolutionary dynamics, possibly because of similarity in the fitness effect (i.e., strength of stabilizing selection) of disruptions in a gene's protein sequence or its mRNA expression. We conclude that, as more and better data accumulate, understanding the causal connections among biological traits and how they are integrated over time to constrain or promote structural and regulatory evolution may finally become possible.