Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes.

Affiliations

Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America.

Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia.

Publication information

PLoS Genet. 2023 Sep 7;19(9):e1010931. doi: 10.1371/journal.pgen.1010931. eCollection 2023 Sep.

Abstract

f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. Not only are they guaranteed to allow robust tests of the fits of proposed models of population history to data when analyzing full genome sequencing data, that is, all single nucleotide polymorphisms (SNPs) in the individuals being analyzed, but they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is an outgroup in a phylogenetic sense to all groups being analyzed. True "outgroup ascertainment" is in practice impossible in humans because our species has arisen from a substructured ancestral population that does not descend from a homogeneous ancestral population going back many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup-ascertainment schemes might produce robust enough results using f-statistics, and that motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the "Affymetrix Human Origins array", which has been genotyped on thousands of modern individuals from hundreds of populations, or the "1240k" in-solution enrichment reagent, which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that while analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup, when co-modeling more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans), fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories, and failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases.
However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
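
As a rough illustration of the quantities discussed above (a sketch, not the authors' code), an f4-statistic can be written as the average over SNPs of a product of allele-frequency differences, and ascertainment can be modeled as a filter that keeps only sites polymorphic in a chosen panel population. The function names `f4` and `ascertain_polymorphic`, the `min_freq` parameter, and the toy frequencies are hypothetical and chosen only for this example; real analyses would also need standard errors (for example via block jackknife), which are omitted here.

```python
from typing import List, Sequence


def f4(a: Sequence[float], b: Sequence[float],
       c: Sequence[float], d: Sequence[float]) -> float:
    """f4(A, B; C, D): mean over SNPs of (a_i - b_i) * (c_i - d_i).

    Each argument holds one population's allele frequencies,
    with the same SNP order in all four sequences.
    """
    if not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("all frequency vectors must have the same length")
    if len(a) == 0:
        raise ValueError("at least one SNP is required")
    total = sum((ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d))
    return total / len(a)


def ascertain_polymorphic(panel: Sequence[float],
                          min_freq: float = 0.0) -> List[int]:
    """Indices of SNPs whose frequency in the panel population lies
    strictly between min_freq and 1 - min_freq (hypothetical filter)."""
    return [i for i, p in enumerate(panel) if min_freq < p < 1.0 - min_freq]


def subset(freqs: Sequence[float], keep: Sequence[int]) -> List[float]:
    """Restrict a frequency vector to the ascertained SNP indices."""
    return [freqs[i] for i in keep]


# Toy allele frequencies for four populations at three SNPs (made up).
pop_a = [0.1, 0.5, 0.9]
pop_b = [0.2, 0.5, 0.7]
pop_c = [0.3, 0.4, 0.8]
pop_d = [0.1, 0.4, 0.6]

# Statistic on all sites versus on sites polymorphic in a panel population.
panel = [0.0, 0.5, 1.0]
keep = ascertain_polymorphic(panel)
f4_all = f4(pop_a, pop_b, pop_c, pop_d)
f4_ascertained = f4(subset(pop_a, keep), subset(pop_b, keep),
                    subset(pop_c, keep), subset(pop_d, keep))
```

Comparing `f4_all` with `f4_ascertained` shows, on toy data, how the choice of ascertainment panel changes which sites contribute to the statistic; whether that change biases model fitting is the question the study addresses with simulations.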

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3232/10508636/3ee5b6189070/pgen.1010931.g001.jpg
