Knudsen Bjarne, Miyamoto Michael M
Department of Zoology, University of Florida, Gainesville, FL 32611-8525, USA.
Genetics. 2007 Aug;176(4):2335-42. doi: 10.1534/genetics.106.063560. Epub 2007 Jun 11.
Coalescent theory provides a powerful framework for estimating the evolutionary, demographic, and genetic parameters of a population from a small sample of individuals. Current coalescent models have largely focused on population genetic factors (e.g., mutation, population growth, and migration) rather than on the effects of experimental design and error. This study develops a new coalescent/mutation model that accounts for unobserved polymorphisms due to missing data, sequence errors, and multiple reads for diploid individuals. The importance of accommodating these effects is illustrated with evolutionary simulations and a real data set from a population of the California sea hare. In particular, a failure to account for sequence errors can lead to overestimated mutation rates, inflated coalescent times, and inappropriate conclusions about the population. The model can serve as a starting point for the development of newer models with additional experimental and population genetic factors. It is currently implemented as a maximum-likelihood method, but it may also serve as the basis for Bayesian approaches that incorporate experimental design and error.
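The claim that ignored sequence errors inflate mutation-rate estimates can be illustrated with a toy simulation. The sketch below is not the maximum-likelihood model described in the abstract. It assumes a standard neutral coalescent with infinite-sites mutation, treats every sequencing error as a new singleton polymorphism, and applies Watterson's estimator naively to the observed count of segregating sites. All function names and parameter values (`theta`, `seq_len`, `error_rate`) are hypothetical choices made for illustration. Under these assumptions the naive estimate is expected to exceed the true value by roughly n · L · ε / a_n, where n is the sample size, L the sequence length, ε the per-base error rate, and a_n = 1 + 1/2 + ... + 1/(n-1).

```python
import random


def harmonic(n):
    """a_n = sum_{i=1}^{n-1} 1/i, the scaling constant in Watterson's estimator."""
    return sum(1.0 / i for i in range(1, n))


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by counting unit-rate arrivals before time lam."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(rng, n, theta):
    """Segregating sites for n sequences under the standard neutral coalescent.

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k(k-1)/2. Mutations fall on the tree as a Poisson
    process with rate theta/2 per unit of branch length (infinite sites).
    """
    total_length = 0.0
    for k in range(n, 1, -1):
        t_k = rng.expovariate(k * (k - 1) / 2.0)
        total_length += k * t_k
    return poisson(rng, theta / 2.0 * total_length)


def mean_watterson(n, theta, seq_len, error_rate, replicates, seed):
    """Average naive Watterson estimate when sequencing errors are ignored.

    Assumption: each error hits a distinct, otherwise monomorphic site, so it
    adds one spurious segregating site. The error count is approximated as
    Poisson with mean n * seq_len * error_rate.
    """
    rng = random.Random(seed)
    a_n = harmonic(n)
    total = 0.0
    for _ in range(replicates):
        s_true = simulate_segregating_sites(rng, n, theta)
        s_error = poisson(rng, n * seq_len * error_rate)
        total += (s_true + s_error) / a_n
    return total / replicates


if __name__ == "__main__":
    clean = mean_watterson(10, 5.0, 1000, 0.0, 2000, seed=1)
    noisy = mean_watterson(10, 5.0, 1000, 1e-3, 2000, seed=1)
    print("mean estimate without errors:", round(clean, 2))
    print("mean estimate with errors:   ", round(noisy, 2))
```

With the illustrative values above (true theta of 5, ten sequences of 1000 bases, one error per thousand bases), the error-free average should sit near 5, while the average with errors should be shifted upward by about 10 / a_10. A model that includes an explicit error term, as the abstract proposes, is meant to absorb this excess of singletons rather than attribute it to mutation.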