Efficient simulation and likelihood methods for non-neutral multi-allele models.

Author information

Joyce Paul, Genz Alan, Buzbas Erkan Ozge

Affiliation

Department of Mathematics and Initiative for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, ID, USA.

Publication information

J Comput Biol. 2012 Jun;19(6):650-61. doi: 10.1089/cmb.2012.0033.

Abstract

Throughout the 1980s, Simon Tavaré made numerous significant contributions to population genetics theory. As genetic data, in particular DNA sequence data, became more readily available, the need to connect population-genetic models to data became the central issue. The seminal work of Griffiths and Tavaré (1994a, 1994b, 1994c) was among the first to develop a likelihood method to estimate population-genetic parameters using full DNA sequences. Now we are in the genomics era, where methods need to scale up to handle massive data sets, and Tavaré has led the way to new approaches. However, performing statistical inference under non-neutral models has proved elusive. In tribute to Simon Tavaré, we present an article in the spirit of his work that provides a computationally tractable method for simulating and analyzing data under a class of non-neutral population-genetic models. Computational methods for approximating likelihood functions and generating samples under a class of allele-frequency-based non-neutral parent-independent mutation models were proposed by Donnelly, Nordborg, and Joyce (DNJ) (Donnelly et al., 2001). DNJ (2001) simulated samples of allele frequencies from non-neutral models using neutral models as the auxiliary distribution in a rejection algorithm. However, the patterns of allele frequencies produced by neutral models are dissimilar to those produced by non-neutral models, making the rejection method inefficient. For example, in some cases the methods in DNJ (2001) require 10^9 rejections before a sample from the non-neutral model is accepted. Our method simulates samples directly from the distribution of non-neutral models, making simulation a practical tool for studying the behavior of the likelihood and for performing inference on the strength of selection.
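
The rejection scheme attributed to DNJ (2001) above can be illustrated with a minimal sketch. It is not the authors' code, and it rests on an assumption the abstract does not state: that the non-neutral density is the neutral symmetric Dirichlet density reweighted by exp(-sigma * sum(x_i^2)), a common form for symmetric selection against homozygotes. The function names and the parameters `k` (number of alleles), `theta` (mutation rate), and `sigma` (selection strength) are hypothetical.

```python
import math
import random


def sample_neutral(k, theta, rng):
    """Draw k allele frequencies from a symmetric Dirichlet(theta/k, ..., theta/k)."""
    while True:
        gammas = [rng.gammavariate(theta / k, 1.0) for _ in range(k)]
        total = sum(gammas)
        if total > 0.0:  # guard against underflow when theta/k is very small
            return [g / total for g in gammas]


def rejection_sample(k, theta, sigma, rng, max_tries=1_000_000):
    """Rejection sampler with the neutral model as the auxiliary distribution.

    Assumed target (not taken from the abstract): density proportional to
    exp(-sigma * sum(x_i^2)) times the neutral Dirichlet density, sigma >= 0.
    Returns the accepted frequencies and the number of proposals used.
    """
    for tries in range(1, max_tries + 1):
        x = sample_neutral(k, theta, rng)
        homozygosity = sum(f * f for f in x)
        # Homozygosity is at least 1/k, so this ratio never exceeds 1
        # (up to rounding) and is a valid acceptance probability.
        accept_prob = math.exp(-sigma * (homozygosity - 1.0 / k))
        if rng.random() < accept_prob:
            return x, tries
    raise RuntimeError("no proposal accepted within max_tries")


if __name__ == "__main__":
    rng = random.Random(1)
    for sigma in (0.0, 5.0, 20.0):
        _, tries = rejection_sample(k=4, theta=4.0, sigma=sigma, rng=rng)
        print(f"sigma={sigma}: accepted after {tries} proposal(s)")
```

Under this assumed form, the acceptance probability shrinks exponentially as `sigma` grows, because neutral proposals rarely have homozygosity close to its minimum of 1/k. That is the kind of inefficiency the abstract describes, and it is what sampling directly from the non-neutral distribution avoids.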
