Revealing the range of equally likely estimates in the admixture model.

Author information

Heinzel Carola Sophia, Baumdicker Franz, Pfaffelhuber Peter

Affiliations

Department of Mathematical Stochastics, Albert-Ludwigs-University Freiburg, Freiburg im Breisgau 79104, Germany.

Cluster of Excellence "Controlling Microbes to Fight Infections", Mathematical and Computational Population Genetics, University of Tübingen, Sand 14, Tübingen 72076, Germany.

Publication information

G3 (Bethesda). 2025 Aug 6;15(8). doi: 10.1093/g3journal/jkaf142.

Abstract

Many ancestry inference tools, including STRUCTURE and ADMIXTURE, rely on the admixture model to infer both allele frequencies p and individual admixture proportions q for a collection of individuals relative to a set of hypothetical ancestral populations. We show that under realistic conditions the likelihood in the admixture model is typically flat in some direction around a maximum-likelihood estimate (q̂, p̂). In particular, the maximum-likelihood estimator is nonunique and there is a complete spectrum of possible estimates. Common inference tools typically identify only a few points within this spectrum. We provide an algorithm that computes the set of equally likely (q, p) when starting from (q̂, p̂). It is analytic for K = 2 ancestral populations and numeric for K > 2. We apply our algorithm to data from the 1000 Genomes Project and show that inter-European estimators of q can come with a large set of equally likely possibilities. In general, markers with large allele frequency differences between populations, in combination with individuals with concentrated admixture proportions, lead to small areas with a flat likelihood. Our findings imply that care must be taken when interpreting results from STRUCTURE and ADMIXTURE if populations are not separated well enough.
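
The non-uniqueness described in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm. It assumes the standard binomial form of the admixture likelihood, in which the allele count of individual i at marker m is Binomial(2, (qp)_im), so the likelihood depends on (q, p) only through the product qp. Under that assumption, any invertible K x K matrix S whose rows sum to 1 maps (q, p) to (qS, S^{-1}p) without changing the likelihood, provided the new values still satisfy the model constraints. The toy data, the matrix S, and the helper names `log_likelihood`, `transform`, and `is_valid` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def log_likelihood(q, p, x):
    """Binomial log-likelihood (up to an additive constant).

    q: (N, K) admixture proportions, each row sums to 1
    p: (K, M) allele frequencies in [0, 1]
    x: (N, M) allele counts in {0, 1, 2}
    """
    theta = q @ p  # expected allele frequency per individual and marker
    return np.sum(x * np.log(theta) + (2 - x) * np.log(1 - theta))


def transform(q, p, s):
    """Map (q, p) to (q S, S^{-1} p); the product q p is unchanged."""
    return q @ s, np.linalg.solve(s, p)


def is_valid(q, p, tol=1e-12):
    """Check the admixture-model constraints on q and p."""
    return bool(
        q.min() >= -tol
        and p.min() >= -tol
        and p.max() <= 1 + tol
        and np.allclose(q.sum(axis=1), 1.0)
    )


# Hypothetical toy data: N = 3 individuals, K = 2 populations, M = 3 markers.
# (q0, p0) is an arbitrary starting point, not a fitted estimate.
q0 = np.array([[0.7, 0.3], [0.4, 0.6], [0.5, 0.5]])
p0 = np.array([[0.30, 0.40, 0.60], [0.60, 0.50, 0.35]])
x = np.array([[0, 1, 2], [1, 1, 0], [2, 0, 1]])

# For K = 2, a matrix with rows summing to 1 has two free parameters a and b.
a, b = 0.10, 0.05
s = np.array([[1 - a, a], [b, 1 - b]])

q1, p1 = transform(q0, p0, s)

print("constraints satisfied:", is_valid(q1, p1))
print("log-likelihood before:", log_likelihood(q0, p0, x))
print("log-likelihood after: ", log_likelihood(q1, p1, x))
```

Because both parameter pairs give the same likelihood, if one of them is a maximizer then so is the other. How far a and b can be moved before a constraint is violated depends on how extreme the entries of q and p already are, which is consistent with the abstract's remark that well-separated allele frequencies and concentrated admixture proportions leave only small flat regions.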

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/096a/12341866/aeae92873414/jkaf142f1.jpg
