Sequential Importance Sampling for Multiresolution Kingman-Tajima Coalescent Counting

Authors

Lorenzo Cappello, Julia A. Palacios

Affiliation

Stanford University.

Publication

Ann Appl Stat. 2020 Jun;14(2):727-751. doi: 10.1214/19-AOAS1313.

DOI:10.1214/19-AOAS1313

PMID:33995755

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8118586/

Abstract

Statistical inference of evolutionary parameters from molecular sequence data relies on coalescent models to account for the shared genealogical ancestry of the samples. However, inferential algorithms do not scale to available data sets. A strategy to improve computational efficiency is to rely on simpler coalescent and mutation models, resulting in smaller hidden state spaces. An estimate of the cardinality of the state space of genealogical trees at different resolutions is essential to decide the best modeling strategy for a given data set. To our knowledge, there is neither an exact nor an approximate method to determine these cardinalities. We propose a sequential importance sampling algorithm to estimate the cardinality of the sample space of genealogical trees under different coalescent resolutions. Our sampling scheme proceeds sequentially across the set of combinatorial constraints imposed by the data, which in this work are completely linked sequences of DNA at a nonrecombining segment. We analyze the cardinality of different genealogical tree spaces on simulations to study the settings that favor coarser resolutions. We apply our method to estimate the cardinality of genealogical tree spaces from mtDNA data from the 1000 Genomes Project and from a sample from a Melanesian population at the β-globin locus.
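
As a rough illustration of counting by sequential importance sampling, the sketch below estimates the number of ranked, unlabeled tree shapes (the Tajima resolution) on n samples when the data impose no constraints. It is not the authors' algorithm, which conditions on the observed sequences; the state encoding `(s, m)` (singleton lineages, internal lineages) and the function names `moves`, `exact_count`, and `sis_estimate` are assumptions made for this example. Each walk picks uniformly among the distinct coalescence moves and multiplies its weight by the number of moves available, so the average weight is an unbiased estimate of the count (Knuth's estimator). A small dynamic program gives the exact value for comparison.

```python
import random
from functools import lru_cache
from math import comb


def moves(s, m):
    """Distinct coalescence moves from a state with s singleton lineages
    (indistinguishable) and m internal lineages (distinguishable by rank).
    Returns (label, number of distinct moves, next state) triples."""
    return [
        ("SS", 1 if s >= 2 else 0, (s - 2, m + 1)),  # two singletons merge
        ("SI", m if s >= 1 else 0, (s - 1, m)),      # singleton joins an internal lineage
        ("II", comb(m, 2), (s, m - 1)),              # two internal lineages merge
    ]


@lru_cache(maxsize=None)
def exact_count(s, m=0):
    """Exact number of ranked unlabeled tree shapes, by dynamic programming."""
    if s + m == 1:
        return 1
    return sum(k * exact_count(*nxt) for _, k, nxt in moves(s, m) if k > 0)


def sis_estimate(n, num_samples, rng):
    """Sequential importance sampling estimate of the same count."""
    total = 0
    for _ in range(num_samples):
        s, m, weight = n, 0, 1
        while s + m > 1:
            opts = [(k, nxt) for _, k, nxt in moves(s, m) if k > 0]
            weight *= sum(k for k, _ in opts)  # inverse of the proposal probability
            s, m = rng.choices(
                [nxt for _, nxt in opts], weights=[k for k, _ in opts]
            )[0]
        total += weight
    return total / num_samples


if __name__ == "__main__":
    n = 7
    print("exact:", exact_count(n))
    print("SIS estimate:", sis_estimate(n, 20000, random.Random(0)))
```

For comparison, the unconstrained Kingman resolution (ranked, labeled trees) has n!(n-1)!/2^(n-1) topologies, so for n = 7 the Tajima space above is several orders of magnitude smaller.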

