SIMULATION FROM ENDPOINT-CONDITIONED, CONTINUOUS-TIME MARKOV CHAINS ON A FINITE STATE SPACE, WITH APPLICATIONS TO MOLECULAR EVOLUTION.

Authors

Hobolth Asger, Stone Eric A

Affiliation

Department of Mathematical Sciences, Aarhus University, Denmark.

Publication

Ann Appl Stat. 2009 Sep 1;3(3):1204. doi: 10.1214/09-AOAS247.

Abstract

Analyses of serially-sampled data often begin with the assumption that the observations represent discrete samples from a latent continuous-time stochastic process. The continuous-time Markov chain (CTMC) is one such generative model whose popularity extends to a variety of disciplines ranging from computational finance to human genetics and genomics. A common theme among these diverse applications is the need to simulate sample paths of a CTMC conditional on realized data that is discretely observed. Here we present a general solution to this sampling problem when the CTMC is defined on a discrete and finite state space. Specifically, we consider the generation of sample paths, including intermediate states and times of transition, from a CTMC whose beginning and ending states are known across a time interval of length T. We first unify the literature through a discussion of the three predominant approaches: (1) modified rejection sampling, (2) direct sampling, and (3) uniformization. We then give analytical results for the complexity and efficiency of each method in terms of the instantaneous transition rate matrix Q of the CTMC, its beginning and ending states, and the length of sampling time T. In doing so, we show that no method dominates the others across all model specifications, and we give explicit proof of which method prevails for any given Q, T, and endpoints. Finally, we introduce and compare three applications of CTMCs to demonstrate the pitfalls of choosing an inefficient sampler.
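
The abstract names the three samplers without giving their details. As a rough illustration of the first, the following is a minimal Python sketch of rejection sampling for a path that starts in state a and ends in state b after time T. It rests on assumptions not stated in the abstract: the "modification" is taken to mean drawing the first waiting time from an exponential truncated to [0, T] whenever a differs from b (so that at least one jump occurs), and the function name `sample_path`, its arguments, and the list-of-lists layout of Q are hypothetical choices made for this example.

```python
import math
import random


def sample_path(Q, a, b, T, rng, max_tries=100_000):
    """Draw one path from state a (time 0) to state b (time T) by rejection.

    Q is a rate matrix given as a list of lists: off-diagonal entries are
    jump rates and each diagonal entry is minus the sum of the rest of its
    row. Returns [(time, state), ...], starting with (0.0, a).
    """
    for _ in range(max_tries):
        t, state, path = 0.0, a, [(0.0, a)]
        first = True
        while True:
            rate = -Q[state][state]
            if rate <= 0.0:
                break  # absorbing state: no further jumps
            if first and a != b:
                # Assumed modification: a != b forces at least one jump, so
                # the first waiting time is exponential truncated to [0, T].
                u = rng.random()
                wait = -math.log(1.0 - u * (1.0 - math.exp(-rate * T))) / rate
            else:
                wait = rng.expovariate(rate)
            first = False
            t += wait
            if t >= T:
                break  # the chain stays in `state` until time T
            # Pick the next state with probability proportional to its rate.
            others = [j for j in range(len(Q)) if j != state]
            weights = [Q[state][j] for j in others]
            state = rng.choices(others, weights=weights)[0]
            path.append((t, state))
        if state == b:
            return path  # accept: the proposal ends in the required state
    raise RuntimeError("no path accepted; the endpoints may be too unlikely")


# Example: a three-state chain, conditioned to go from state 0 to state 2.
Q = [[-1.0, 0.5, 0.5],
     [0.5, -1.0, 0.5],
     [0.5, 0.5, -1.0]]
path = sample_path(Q, a=0, b=2, T=1.0, rng=random.Random(0))
```

The acceptance rate of a sampler like this depends on how likely the ending state is under forward simulation, which is the kind of dependence on Q, T, and the endpoints that the abstract says the paper analyzes.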

Similar articles

Phylogenetic stochastic mapping without matrix exponentiation.
J Comput Biol. 2014 Sep;21(9):676-90. doi: 10.1089/cmb.2014.0062. Epub 2014 Jun 11.
Geometric fluid approximation for general continuous-time Markov chains.
Proc Math Phys Eng Sci. 2019 Sep;475(2229):20190100. doi: 10.1098/rspa.2019.0100. Epub 2019 Sep 25.

Cited by

Bayesian phylodynamic inference of population dynamics with dormancy.
Proc Natl Acad Sci U S A. 2025 May 6;122(18):e2501394122. doi: 10.1073/pnas.2501394122. Epub 2025 May 2.
Data integration in Bayesian phylogenetics.
Annu Rev Stat Appl. 2023;10:353-377. doi: 10.1146/annurev-statistics-033021-112532. Epub 2022 Sep 28.
