Smith Chris C R, Kern Andrew D
Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA.
bioRxiv. 2023 Oct 5:2023.07.30.551115. doi: 10.1101/2023.07.30.551115.
Spatial genetic variation is shaped in part by an organism's dispersal ability. We present a deep learning tool, disperseNN2, for estimating the mean per-generation dispersal distance from georeferenced polymorphism data. Our neural network performs feature extraction on pairs of genotypes and uses the geographic information that comes with each sample. These attributes led disperseNN2 to outperform a state-of-the-art deep learning method that does not use explicit spatial information: the mean relative absolute error was reduced by 33% and 48% using sample sizes of 10 and 100 individuals, respectively. disperseNN2 is particularly useful for non-model organisms or systems with sparse genomic resources, as it uses unphased single-nucleotide polymorphisms as its input. The software is open source and available from https://github.com/kr-colab/disperseNN2, with documentation located at https://dispersenn2.readthedocs.io/en/latest/.
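The accuracy comparison above is stated in terms of mean relative absolute error (MRAE). The abstract does not define the metric, so the sketch below assumes the common form, the average of |predicted - true| / true over test datasets, and shows how a percentage reduction between two methods could be computed. The function names and all numbers are hypothetical illustrations, not part of disperseNN2 and not results from the paper.

```python
def mean_relative_absolute_error(true_values, predicted_values):
    """Average of |predicted - true| / true (assumed definition of MRAE)."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    if not true_values:
        raise ValueError("inputs must not be empty")
    if any(t <= 0 for t in true_values):
        raise ValueError("true dispersal distances must be positive")
    relative_errors = [
        abs(p - t) / t for t, p in zip(true_values, predicted_values)
    ]
    return sum(relative_errors) / len(relative_errors)


def relative_reduction(baseline_error, new_error):
    """Fractional reduction in error of a new method relative to a baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical true and predicted dispersal distances for three test datasets.
true_sigma = [1.0, 2.0, 4.0]
predicted_sigma = [1.5, 1.0, 4.0]
mrae = mean_relative_absolute_error(true_sigma, predicted_sigma)

# Hypothetical MRAE values for a baseline method and a new method.
reduction = relative_reduction(0.3, 0.2)
print(f"MRAE = {mrae:.3f}, reduction = {reduction:.1%}")
```

Under this definition, a reported 33% reduction would mean that the new method's MRAE is about two thirds of the baseline's at the same sample size.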