Kern Andrew D
Department of Biological Sciences, Dartmouth College, Hanover, NH, USA.
PLoS One. 2009;4(4):e5152. doi: 10.1371/journal.pone.0005152. Epub 2009 Apr 16.
Comparative genomics based on sequenced reference genomes is essential to hypothesis generation and testing within population genetics. However, selection of candidate regions for further study on the basis of elevated or depressed divergence between species leads to a divergence-based ascertainment bias in the site frequency spectrum within selected candidate loci. Here, a method to correct this problem is developed that obtains maximum-likelihood estimates of the unascertained allele frequency distribution using numerical optimization. I show how divergence-based ascertainment may mimic the effects of natural selection and offer correction formulae for properly estimating the strength of selection in candidate regions in a maximum-likelihood setting.