Department of Biology, Stanford University, Stanford, CA, USA.
Centro de Investigaciones Científicas de las Huastecas "Aguazarca", Calnali, Hidalgo, Mexico.
Mol Ecol Resour. 2020 Jul;20(4):1141-1151. doi: 10.1111/1755-0998.13175. Epub 2020 May 25.
It has become clear that hybridization between species is much more common than previously recognized. As a result, we now know that the genomes of many modern species, including our own, are a patchwork of regions derived from past hybridization events. Increasingly, researchers are interested in using local ancestry inference methods to determine which regions of the genome originated from each parental species. Because admixture has diverse effects, this interest is shared across disparate fields, from human genetics to ecology and evolutionary biology. However, local ancestry inference methods are sensitive to a range of biological and technical parameters that can affect accuracy. Here we present paired simulation and ancestry inference pipelines, mixnmatch and ancestryinfer, to help researchers plan and execute local ancestry inference studies. mixnmatch can simulate arbitrarily complex demographic histories in the parental and hybrid populations, selection on hybrids, and technical variables such as coverage and contamination. ancestryinfer takes as input sequencing reads from simulated or real individuals and implements an efficient local ancestry inference pipeline. We perform a series of simulations with mixnmatch to pinpoint factors that influence accuracy in local ancestry inference and to highlight useful features of the two pipelines. mixnmatch is a powerful tool for simulating hybridization, while ancestryinfer facilitates local ancestry inference on real or simulated data.
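To make the idea of a genome as a "patchwork" of parental regions concrete, the following minimal Python sketch draws a toy ancestry mosaic for one haploid chromosome. It is not the mixnmatch or ancestryinfer implementation, and every function and parameter name here is hypothetical. It assumes a single pulse of admixture, with ancestry breakpoints placed as a Poisson process whose rate per Morgan equals the number of generations since admixture, and with each segment drawing its ancestry independently.

```python
import random


def simulate_ancestry_tracts(length_morgans, generations, admix_prop, rng):
    """Toy model (an assumption, not the mixnmatch algorithm).

    Returns a list of (start, end, ancestry) tuples covering the chromosome,
    where ancestry is 1 or 2 for the two parental species.
    """
    tracts = []
    pos = 0.0
    while pos < length_morgans:
        # Distance to the next breakpoint: exponential, rate = generations per Morgan.
        step = rng.expovariate(generations)
        end = min(pos + step, length_morgans)
        # Each segment derives from parent 1 with probability admix_prop.
        ancestry = 1 if rng.random() < admix_prop else 2
        tracts.append((pos, end, ancestry))
        pos = end
    return tracts


def merge_adjacent(tracts):
    """Join neighbouring segments that share the same ancestry into one tract."""
    merged = []
    for start, end, ancestry in tracts:
        if merged and merged[-1][2] == ancestry:
            merged[-1] = (merged[-1][0], end, ancestry)
        else:
            merged.append((start, end, ancestry))
    return merged


def ancestry_fraction(tracts, ancestry):
    """Fraction of the chromosome length assigned to the given ancestry."""
    total = sum(end - start for start, end, _ in tracts)
    matching = sum(end - start for start, end, a in tracts if a == ancestry)
    return matching / total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy run is repeatable
    raw = simulate_ancestry_tracts(
        length_morgans=1.0, generations=50, admix_prop=0.25, rng=rng
    )
    tracts = merge_adjacent(raw)
    print(f"{len(tracts)} tracts; parent-1 fraction = {ancestry_fraction(tracts, 1):.3f}")
```

In this toy model, more generations since admixture produce shorter tracts. A local ancestry inference method would try to recover tract boundaries of this kind from sequencing reads.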