Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA.
Division of Health Science and Technology, MIT, Cambridge, Massachusetts 02139, USA.
Genome Res. 2019 Jan;29(1):53-63. doi: 10.1101/gr.237636.118. Epub 2018 Dec 14.
The evolutionary history of a gene helps predict its function and relationship to phenotypic traits. Although sequence conservation is commonly used to decipher gene function and assess medical relevance, methods for functional inference from comparative expression data are lacking. Here, we use RNA-seq across seven tissues from 17 mammalian species to show that expression evolution across mammals is accurately modeled by the Ornstein-Uhlenbeck process, a commonly proposed model of continuous trait evolution. We apply this model to identify expression pathways under neutral, stabilizing, and directional selection. We further demonstrate novel applications of this model to quantify the extent of stabilizing selection on a gene's expression, parameterize the distribution of each gene's optimal expression level, and detect deleterious expression levels in expression data from individual patients. Our work provides a statistical framework for interpreting expression data across species and in disease.
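As an illustration of the model named above, the following is a minimal sketch of an Ornstein-Uhlenbeck process along a single lineage, not the authors' phylogenetic fitting procedure. It assumes the standard parameterization dX = alpha * (theta - X) dt + sigma dW, where theta is the optimal expression level, alpha the strength of stabilizing selection, and sigma the drift rate. These symbols and all function names are assumptions introduced here, not taken from the paper. Under this parameterization the stationary variance is sigma^2 / (2 * alpha), which suggests one simple way to score how far an observed expression value lies from the optimum.

```python
import math
import random


def simulate_ou(x0, theta, alpha, sigma, t_total, n_steps, rng):
    """Simulate dX = alpha*(theta - X) dt + sigma dW and return X(t_total).

    Uses the exact Gaussian transition of the OU process, so the result
    does not depend on the step size. All parameter names are illustrative.
    """
    dt = t_total / n_steps
    decay = math.exp(-alpha * dt)
    # Standard deviation of the transition over one step of length dt.
    step_sd = math.sqrt(sigma ** 2 / (2 * alpha) * (1 - decay ** 2))
    x = x0
    for _ in range(n_steps):
        x = theta + (x - theta) * decay + step_sd * rng.gauss(0.0, 1.0)
    return x


def stationary_variance(alpha, sigma):
    """Long-run variance of the trait around the optimum theta."""
    return sigma ** 2 / (2 * alpha)


def z_score(observed, theta, alpha, sigma):
    """Deviation of an observed value from theta, in stationary SD units."""
    return (observed - theta) / math.sqrt(stationary_variance(alpha, sigma))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameter values chosen only for demonstration.
    draws = [simulate_ou(0.0, 5.0, 1.0, 1.0, 20.0, 50, rng) for _ in range(2000)]
    print(sum(draws) / len(draws))      # close to theta
    print(z_score(7.0, 5.0, 1.0, 1.0))  # deviation of a hypothetical sample
```

A larger alpha relative to sigma gives a narrower stationary distribution, so the same absolute deviation from theta yields a larger score. This is one hedged reading of how constraint on expression could translate into flagging unusual values in an individual sample.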