CMPG, Institute for Ecology and Evolution, University of Bern, Berne, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Mol Ecol Resour. 2024 Aug;24(6):e13981. doi: 10.1111/1755-0998.13981. Epub 2024 May 22.
Admixture is a common biological phenomenon among populations of the same or different species. Identifying admixed tracts within individual genomes can provide valuable information to date admixture events, reconstruct ancestry-specific demographic histories, or detect adaptive introgression, genetic incompatibilities, and genomic regions affected by (associative) overdominance. Although many local ancestry inference (LAI) methods have been developed in the last decade, their performance has been assessed using large reference panels, which are rarely available for non-model organisms or ancient samples. Moreover, the demographic conditions under which LAI becomes unreliable have not been explicitly outlined. Here, we identify the demographic conditions under which local ancestries can be best estimated using very small reference panels. Furthermore, we compare the performance of two LAI methods (RFMix and MOSAIC) with that of a newly developed approach (simpLAI) that can be used even when each reference population consists of a single individual. Based on simulations of various demographic models, we also determine the limits of these LAI tools and propose post-painting filtering steps to reduce false-positive rates and improve the precision and accuracy of the inferred admixed tracts. Besides providing a guide for using LAI, our work shows that reasonable inferences can be obtained from a single diploid genome per reference under demographic conditions that are not uncommon among past human groups and non-model organisms.
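The abstract does not say which post-painting filters are proposed, so the following is only an illustrative sketch of the general idea, under the assumption that one such filter discards very short inferred tracts. The function names, the tract coordinates, and the `min_length` threshold are all hypothetical and are not taken from RFMix, MOSAIC, or simpLAI. The example measures precision as the fraction of inferred admixed base pairs that fall inside simulated (true) admixed tracts.

```python
from typing import List, Tuple

# A tract is a half-open interval [start, end) in base pairs.
Tract = Tuple[int, int]


def filter_short_tracts(tracts: List[Tract], min_length: int) -> List[Tract]:
    """Hypothetical post-painting filter: keep tracts of at least min_length bp."""
    return [(start, end) for start, end in tracts if end - start >= min_length]


def overlap_bp(a: List[Tract], b: List[Tract]) -> int:
    """Base pairs shared by two sets of tracts (each set non-overlapping)."""
    total = 0
    for s1, e1 in a:
        for s2, e2 in b:
            total += max(0, min(e1, e2) - max(s1, s2))
    return total


def precision(inferred: List[Tract], truth: List[Tract]) -> float:
    """Fraction of inferred admixed base pairs that are truly admixed."""
    inferred_bp = sum(end - start for start, end in inferred)
    if inferred_bp == 0:
        return 0.0
    return overlap_bp(inferred, truth) / inferred_bp


# Made-up coordinates for illustration only (assumption, not data from the study).
truth = [(1000, 5000)]
inferred = [(900, 5200), (7000, 7200), (9000, 9100)]

before = precision(inferred, truth)
filtered = filter_short_tracts(inferred, min_length=1000)
after = precision(filtered, truth)
print(f"precision before filtering: {before:.3f}, after: {after:.3f}")
```

In this toy case the two short spurious tracts are removed and precision rises, but a length filter of this kind would also discard short true tracts, so any threshold would need to be tuned against simulations of the demographic scenario of interest.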