SEQUENTIAL IMPORTANCE SAMPLING FOR MULTIRESOLUTION KINGMAN-TAJIMA COALESCENT COUNTING.

Authors

Cappello Lorenzo, Palacios Julia A

Affiliation

Stanford University.

Publication information

Ann Appl Stat. 2020 Jun;14(2):727-751. doi: 10.1214/19-AOAS1313.

Abstract

Statistical inference of evolutionary parameters from molecular sequence data relies on coalescent models to account for the shared genealogical ancestry of the samples. However, inferential algorithms do not scale to available data sets. A strategy to improve computational efficiency is to rely on simpler coalescent and mutation models, resulting in smaller hidden state spaces. An estimate of the cardinality of the state space of genealogical trees at different resolutions is essential to decide the best modeling strategy for a given dataset. To our knowledge, there is neither an exact nor an approximate method to determine these cardinalities. We propose a sequential importance sampling algorithm to estimate the cardinality of the sample space of genealogical trees under different coalescent resolutions. Our sampling scheme proceeds sequentially across the set of combinatorial constraints imposed by the data, which in this work are completely linked sequences of DNA at a nonrecombining segment. We analyze the cardinality of different genealogical tree spaces on simulations to study the settings that favor coarser resolutions. We apply our method to estimate the cardinality of genealogical tree spaces from mtDNA data from the 1000 Genomes Project and from a sample from a Melanesian population at the β-globin locus.
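The general idea of counting by sequential importance sampling can be illustrated on a toy case. The sketch below is not the authors' algorithm: it ignores the data constraints entirely and uses a uniform proposal (a Knuth-style estimator) to count unconstrained ranked, unlabeled tree shapes, the coarser "Tajima" resolution, for n samples. The state encoding (s unlabeled singleton lineages, m ranked internal lineages) and all function names are assumptions made for illustration. Each tree is built one coalescence at a time, and the importance weight is the product of the number of distinct choices available at each step, so the mean weight is an unbiased estimate of the number of trees.

```python
import random
from functools import lru_cache
from math import comb

def transitions(s, m):
    """Next states and how many distinct choices lead to each.

    s: singleton lineages (indistinguishable leaves)
    m: internal lineages (distinguishable by their rank)
    """
    out = []
    if s >= 2:                       # merge two singletons
        out.append(((s - 2, m + 1), 1))
    if s >= 1 and m >= 1:            # merge a singleton with one internal lineage
        out.append(((s - 1, m), m))
    if m >= 2:                       # merge two internal lineages
        out.append(((s, m - 1), comb(m, 2)))
    return out

@lru_cache(maxsize=None)
def exact_count(s, m):
    """Exact number of ranked unlabeled shapes, for checking small cases."""
    if s + m == 1:
        return 1
    return sum(mult * exact_count(*nxt) for nxt, mult in transitions(s, m))

def sis_estimate(n, num_samples, seed=0):
    """Estimate the count for n leaves with a uniform sequential proposal."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_samples):
        s, m, weight = n, 0, 1
        while s + m > 1:
            opts = transitions(s, m)
            mults = [mult for _, mult in opts]
            weight *= sum(mults)     # 1 / (uniform proposal probability)
            (s, m), _ = rng.choices(opts, weights=mults)[0]
        total += weight
    return total / num_samples

if __name__ == "__main__":
    print(exact_count(7, 0))         # 61
    print(sis_estimate(7, 20000, seed=1))
```

For small n the exact recursion is cheap, so it serves as a check on the estimator. With data constraints of the kind the abstract describes, the set of admissible choices at each step would shrink and the proposal would generally no longer be uniform; that part is not attempted here.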


References

A global reference for human genetic variation. Nature. 2015 Oct 1;526(7571):68-74. doi: 10.1038/nature15393.
