Departamento de Bioquímica, Genética e Inmunología, Universidad de Vigo, 36310 Vigo, Spain.
Genetics. 2010 Feb;184(2):429-37. doi: 10.1534/genetics.109.109736. Epub 2009 Nov 23.
The coalescent with recombination is a very useful tool in molecular population genetics. Under this framework, genealogies often represent the evolution of the substitution unit, and because of this, the few coalescent algorithms implemented for the simulation of coding sequences force recombination to occur only between codons. However, recombination is expected to occur most often within codons. Here we have developed an algorithm that can evolve coding sequences under an ancestral recombination graph that represents the genealogies at each nucleotide site, thereby allowing for intracodon recombination. The algorithm is a modification of Hudson's coalescent in which, in addition to keeping track of events occurring in the ancestral material that reaches the sample, we need to keep track of events occurring in ancestral material that does not reach the sample but that is produced by intracodon recombination. We show that at typical substitution rates the number of nonsynonymous changes induced by intracodon recombination is small and that intracodon recombination does not generally result in inflated estimates of the overall nonsynonymous/synonymous substitution ratio (omega). On the other hand, recombination can bias the estimation of omega at particular codons, resulting in apparent rate variation among sites and in the spurious identification of positively selected sites. Importantly, in this case, allowing for variable synonymous rates across sites greatly reduces the false-positive rate and recovers statistical power. Finally, coalescent simulations with intracodon recombination could be used to better represent the evolution of nuclear coding genes or fast-evolving pathogens such as HIV-1. We have implemented this algorithm in a computer program called NetRecodon, freely available at http://darwin.uvigo.es.
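To illustrate why intracodon recombination can induce nonsynonymous changes, the following minimal Python sketch joins two parental codons at a breakpoint inside the codon and checks whether the recombinant encodes an amino acid carried by neither parent. It is not taken from NetRecodon: the function names `recombine_codon` and `is_novel_amino_acid` are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (assumption:
# the standard code applies; other codes would need a different string).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def recombine_codon(left_parent: str, right_parent: str, breakpoint: int) -> str:
    """Return the codon formed by the first `breakpoint` bases of left_parent
    followed by the remaining bases of right_parent (hypothetical helper)."""
    if breakpoint not in (1, 2):
        raise ValueError("an intracodon breakpoint must fall after base 1 or 2")
    return left_parent[:breakpoint] + right_parent[breakpoint:]


def is_novel_amino_acid(left_parent: str, right_parent: str, breakpoint: int) -> bool:
    """True if the recombinant codon encodes an amino acid that neither
    parental codon encodes (hypothetical helper)."""
    child = recombine_codon(left_parent, right_parent, breakpoint)
    parental = {CODON_TABLE[left_parent], CODON_TABLE[right_parent]}
    return CODON_TABLE[child] not in parental


if __name__ == "__main__":
    # AAA (Lys) and GGG (Gly) recombining after the first base give AGG (Arg).
    child = recombine_codon("AAA", "GGG", 1)
    print(child, CODON_TABLE[child])  # AGG R
```

In this example a breakpoint after the first base yields a codon for an amino acid absent from both parents, whereas a breakpoint after the second base gives AAG, which still encodes lysine. Recombination between codons, by contrast, always leaves each codon identical to one of its parents.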