

Fast and accurate joint inference of coancestry parameters for populations and/or individuals.

Affiliations

MIA-Paris, INRAE, AgroParisTech, Université Paris-Saclay, Palaiseau, France.

Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Génétique Quantitative et Evolution-Le Moulon, Gif-sur-Yvette, France.

Publication information

PLoS Genet. 2023 Jan 19;19(1):e1010054. doi: 10.1371/journal.pgen.1010054. eCollection 2023 Jan.

Abstract

We introduce a fast, new algorithm for inferring from allele count data the FST parameters describing genetic distances among a set of populations and/or unrelated diploid individuals, and a tree with branch lengths corresponding to FST values. The tree can reflect historical processes of splitting and divergence, but seeks to represent the actual genetic variance as accurately as possible with a tree structure. We generalise two major approaches to defining FST, via correlations and mismatch probabilities of sampled allele pairs, which measure shared and non-shared components of genetic variance. A diploid individual can be treated as a population of two gametes, which allows inference of coancestry coefficients for individuals as well as for populations, or a combination of the two. A simulation study illustrates that our fast method-of-moments estimation of FST values, simultaneously for multiple populations/individuals, gains statistical efficiency over pairwise approaches when the population structure is close to tree-like. We apply our approach to genome-wide genotypes from the 26 worldwide human populations of the 1000 Genomes Project. We first analyse at the population level, then a subset of individuals and in a final analysis we pool individuals from the more homogeneous populations. This flexible analysis approach gives advantages over traditional approaches to population structure/coancestry, including visual and quantitative assessments of long-standing questions about the relative magnitudes of within- and between-population genetic differences.
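
To make the "mismatch probabilities of sampled allele pairs" idea concrete, here is a minimal sketch of a conventional pairwise, method-of-moments FST estimate from allele counts. It is not the joint tree-based algorithm introduced in the paper; it illustrates the kind of pairwise approach the abstract compares against. The function name `pairwise_fst_mismatch`, the restriction to biallelic loci, and the ratio-of-sums averaging across loci are all assumptions made for this illustration.

```python
def pairwise_fst_mismatch(x1, n1, x2, n2):
    """Hypothetical helper (not from the paper): pairwise FST from allele counts.

    x1[l], x2[l]: copies of a reference allele at biallelic locus l
                  in population 1 and population 2.
    n1[l], n2[l]: number of alleles sampled at locus l (must be >= 2).

    FST is estimated as 1 - (within mismatch) / (between mismatch),
    summing numerator and denominator over loci before taking the ratio.
    """
    within_sum = 0.0
    between_sum = 0.0
    for a1, m1, a2, m2 in zip(x1, n1, x2, n2):
        # Probability that two distinct alleles drawn from the same
        # population differ (sampling without replacement).
        h1 = 2.0 * a1 * (m1 - a1) / (m1 * (m1 - 1))
        h2 = 2.0 * a2 * (m2 - a2) / (m2 * (m2 - 1))
        # Probability that one allele from each population differ.
        p1, p2 = a1 / m1, a2 / m2
        within_sum += 0.5 * (h1 + h2)
        between_sum += p1 * (1.0 - p2) + p2 * (1.0 - p1)
    if between_sum == 0.0:
        raise ValueError("no between-population mismatches; FST is undefined")
    return 1.0 - within_sum / between_sum


# Two populations fixed for opposite alleles at both loci.
fst = pairwise_fst_mismatch([10, 0], [10, 10], [0, 10], [10, 10])
print(fst)  # 1.0
```

With `n = 2` the within-population term reduces to 1 for a heterozygote and 0 for a homozygote, which matches the abstract's remark that a diploid individual can be treated as a population of two gametes. Because the within-population term is computed without replacement, the estimate can be slightly negative when two samples have similar allele frequencies.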


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dca9/9888729/650804fd0d3f/pgen.1010054.g001.jpg
