Zeng Ping, Zhou Xiang
Department of Epidemiology and Biostatistics, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China.
Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.
Nat Commun. 2017 Sep 6;8(1):456. doi: 10.1038/s41467-017-00470-2.
Using genotype data to perform accurate genetic prediction of complex traits can facilitate genomic selection in animal and plant breeding programs, and can aid in the development of personalized medicine in humans. Because most complex traits have a polygenic architecture, accurate genetic prediction often requires modeling all genetic variants together via polygenic methods. Here, we develop such a polygenic method, which we refer to as the latent Dirichlet process regression model. Dirichlet process regression is non-parametric in nature: it relies on the Dirichlet process to flexibly and adaptively model the effect size distribution, and thus enjoys robust prediction performance across a broad spectrum of genetic architectures. We compare Dirichlet process regression with several commonly used prediction methods in simulations. We further apply Dirichlet process regression to predict gene expression, to conduct PrediXcan-based gene set tests, to perform genomic selection of four traits in two species, and to predict eight complex traits in a human cohort.

Genetic prediction of complex traits with polygenic architecture has wide application from animal breeding to disease prevention. Here, Zeng and Zhou develop a non-parametric genetic prediction method based on latent Dirichlet process regression models.
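To illustrate what modeling the effect size distribution with a Dirichlet process can look like, the following is a minimal sketch, not the authors' implementation. It assumes a truncated stick-breaking representation of a Dirichlet process scale mixture of zero-mean normals. The function names, the truncation level `K`, the concentration `alpha`, and the gamma draw for the component variances are all hypothetical choices made for illustration only.

```python
import numpy as np


def stick_breaking_weights(alpha, K, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    alpha is the concentration parameter and K the truncation level;
    both are illustrative assumptions, not values from the paper.
    """
    v = rng.beta(1.0, alpha, size=K)
    v[-1] = 1.0  # truncation: the last stick takes whatever mass remains
    # Mass left over before each break: 1, (1-v1), (1-v1)(1-v2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def sample_effect_sizes(p, alpha=1.0, K=10, seed=0):
    """Draw p effect sizes from a truncated DP scale mixture of normals."""
    rng = np.random.default_rng(seed)
    weights = stick_breaking_weights(alpha, K, rng)
    # Hypothetical per-component variances; a real model would infer these.
    variances = rng.gamma(shape=1.0, scale=0.01, size=K)
    # Assign each variant to a mixture component, then draw its effect.
    component = rng.choice(K, size=p, p=weights)
    beta = rng.normal(0.0, np.sqrt(variances[component]))
    return beta, weights, component


if __name__ == "__main__":
    beta, weights, component = sample_effect_sizes(p=1000)
    print("mixture weights:", np.round(weights, 3))
    print("effect size std:", beta.std())
```

Because the number of occupied components and their variances are not fixed in advance, a mixture of this kind can approximate sparse and highly polygenic effect size distributions alike, which is the flexibility the abstract attributes to the method.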