Institute of Ecology and Evolution, University of Oregon, Eugene, OR.
Mol Biol Evol. 2020 Jun 1;37(6):1790-1808. doi: 10.1093/molbev/msaa038.
Accurately inferring the genome-wide landscape of recombination rates in natural populations is a central aim in genomics, as patterns of linkage influence everything from genetic mapping to understanding evolutionary history. Here, we describe recombination landscape estimation using recurrent neural networks (ReLERNN), a deep learning method for estimating a genome-wide recombination map that is accurate even with small numbers of pooled or individually sequenced genomes. Rather than use summaries of linkage disequilibrium as its input, ReLERNN takes columns from a genotype alignment, which are then modeled as a sequence across the genome using a recurrent neural network. We demonstrate that ReLERNN improves accuracy and reduces bias relative to existing methods and maintains high accuracy in the face of demographic model misspecification, missing genotype calls, and genome inaccessibility. We apply ReLERNN to natural populations of African Drosophila melanogaster and show that genome-wide recombination landscapes, although largely correlated among populations, exhibit important population-specific differences. Lastly, we connect the inferred patterns of recombination with the frequencies of major inversions segregating in natural Drosophila populations.
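The central idea above, reading the columns of a genotype alignment one site at a time with a recurrent network, can be illustrated with a minimal sketch. This is not ReLERNN's architecture or code: the plain tanh recurrent cell, the function names `init_params` and `rnn_forward`, the hidden size, and the random toy window are all assumptions made for illustration. The weights are untrained, so the output is only a placeholder for what a trained network would report as a recombination rate estimate.

```python
import numpy as np

def init_params(n_samples, hidden_size, seed=0):
    """Random (untrained) weights for a plain tanh recurrent cell. Hypothetical helper."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_x": rng.normal(0.0, scale, (hidden_size, n_samples)),    # input -> hidden
        "W_h": rng.normal(0.0, scale, (hidden_size, hidden_size)),  # hidden -> hidden
        "b_h": np.zeros(hidden_size),
        "w_out": rng.normal(0.0, scale, hidden_size),               # hidden -> scalar
        "b_out": 0.0,
    }

def rnn_forward(genotypes, params):
    """Scan a window of genotypes site by site and return one scalar.

    genotypes has shape (n_sites, n_samples): each row holds the 0/1 alleles of
    all samples at one site, i.e. one column of the genotype alignment.
    """
    h = np.zeros(params["b_h"].shape[0])
    for column in genotypes:
        # The hidden state carries information from earlier sites forward,
        # which is how the network treats the window as an ordered sequence.
        h = np.tanh(params["W_x"] @ column + params["W_h"] @ h + params["b_h"])
    # Linear readout of the final hidden state (placeholder for a rate estimate).
    return float(params["w_out"] @ h + params["b_out"])

# Toy window: 50 segregating sites in 8 sampled chromosomes (random, not real data).
rng = np.random.default_rng(1)
window = rng.integers(0, 2, size=(50, 8)).astype(float)
params = init_params(n_samples=8, hidden_size=16)
prediction = rnn_forward(window, params)
print(prediction)
```

In a real application the weights would be fitted to labeled training examples, and a gated cell such as a GRU or LSTM would typically replace the plain tanh cell; both steps are omitted here to keep the sketch short.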