Li Weiyi, Baehr Stephan, Marasco Michelle, Reyes Lauren, Brister Danielle, Pikaard Craig S, Gout Jean-Francois, Vermulst Marc, Lynch Michael
Department of Genetics, Stanford University School of Medicine, Stanford University, Stanford, CA 94305.
Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85287.
bioRxiv. 2025 Jan 14:2023.05.02.538944. doi: 10.1101/2023.05.02.538944.
The expression of genomically encoded information is not error-free. Transcript-error rates are dramatically higher than DNA-level mutation rates, and despite their transient nature, the steady-state load of such errors must impose some burden on cellular performance. However, a broad perspective on the degree to which transcript-error rates are constrained by natural selection and diverge among lineages remains to be developed. Here, we present a genome-wide analysis of transcript-error rates across the Tree of Life using a modified rolling-circle sequencing method, revealing that the range in error rates is remarkably narrow across diverse species. Transcript errors tend to be randomly distributed, with little evidence supporting local control of error rates associated with gene-expression levels. A majority of transcript errors would result in missense errors if translated, and these, along with a fraction of nonsense transcript errors, are underrepresented relative to random expectations, suggesting the existence of mechanisms for purging some such errors. To quantitatively understand how natural selection and random genetic drift might shape transcript-error rates across species, we present a model based on cell biology and population genetics, incorporating information on cell volume, proteome size, average degree of exposure of individual errors, and effective population size. While this model provides a framework for understanding the evolution of this highly conserved trait, as currently structured it explains only 20% of the variation in the data, suggesting a need for further theoretical work in this area.