Exact and efficient phylodynamic simulation from arbitrarily large populations.

Authors

Celentano Michael, DeWitt William S, Prillo Sebastian, Song Yun S

Affiliations

Department of Statistics, University of California, Berkeley.

Computer Science Division, University of California, Berkeley.

Publication information

ArXiv. 2024 Aug 10:arXiv:2402.17153v2.

PMID:38463501
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10925381/
Abstract

Many biological studies involve inferring the evolutionary history of a sample of individuals from a large population and interpreting the reconstructed tree. Such an ascertained tree typically represents only a small part of a comprehensive population tree and is distorted by survivorship and sampling biases. Inferring evolutionary parameters from ascertained trees requires modeling both the underlying population dynamics and the ascertainment process. A crucial component of this phylodynamic modeling involves tree simulation, which is used to benchmark probabilistic inference methods. To simulate an ascertained tree, one must first simulate the full population tree and then prune unobserved lineages. Consequently, the computational cost is determined not by the size of the final simulated tree, but by the size of the population tree in which it is embedded. In most biological scenarios, simulations of the entire population are prohibitively expensive due to computational demands placed on lineages without sampled descendants. Here, we address this challenge by proving that, for any partially ascertained process from a general multi-type birth-death-mutation-sampling model, there exists an equivalent process with complete sampling and no death, a property which we leverage to develop a highly efficient algorithm for simulating trees. Our algorithm scales linearly with the size of the final simulated tree and is independent of the population size, enabling simulations from extremely large populations beyond the reach of current methods but essential for various biological applications. We anticipate that this unprecedented speedup will significantly advance the development of novel inference methods that require extensive training data.
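
The cost argument in the abstract can be illustrated with a minimal sketch of the naive "simulate the full population tree, then prune" baseline. This is not the authors' algorithm. It assumes a single-type, constant-rate birth-death process with sampling only at the final time, and the function name `simulate_then_prune` and its parameters (`birth`, `death`, `rho`, `t_max`) are hypothetical names chosen for illustration.

```python
import random


def simulate_then_prune(birth, death, rho, t_max, seed=0):
    """Naive baseline (illustrative assumption, not the paper's method).

    Simulates every lineage of a single-type birth-death process up to
    time t_max, samples each surviving lineage with probability rho, and
    keeps only the nodes that are ancestral to a sampled lineage.

    Returns (total_nodes, kept_nodes, sampled_leaves).
    """
    rng = random.Random(seed)
    parent = {0: None}  # node id -> parent id (root has no parent)
    alive = [0]         # ids of lineages alive at the current time
    next_id = 1
    t = 0.0

    # Gillespie-style loop: the work done is proportional to the number
    # of events in the whole population, not to the sampled tree.
    while alive:
        t += rng.expovariate(len(alive) * (birth + death))
        if t >= t_max:
            break
        # Pick a uniformly random living lineage and remove it.
        i = rng.randrange(len(alive))
        alive[i], alive[-1] = alive[-1], alive[i]
        node = alive.pop()
        if rng.random() < birth / (birth + death):
            # Birth: the lineage is replaced by two children.
            for _ in range(2):
                parent[next_id] = node
                alive.append(next_id)
                next_id += 1
        # Otherwise the event is a death and the lineage just disappears.

    # Ascertainment: sample each survivor independently with probability rho.
    sampled = [n for n in alive if rng.random() < rho]

    # Pruning: walk from each sampled leaf to the root, stopping early
    # when an already-kept ancestor is reached.
    kept = set()
    for leaf in sampled:
        n = leaf
        while n is not None and n not in kept:
            kept.add(n)
            n = parent[n]
    return len(parent), len(kept), len(sampled)


if __name__ == "__main__":
    total, kept, leaves = simulate_then_prune(1.0, 0.5, 0.05, 10.0, seed=1)
    print(f"simulated nodes: {total}, kept after pruning: {kept}, "
          f"sampled leaves: {leaves}")
```

With a small sampling probability, most simulated nodes are typically discarded by the pruning step, which is the inefficiency the abstract describes. In the special case of no death and complete sampling (`death=0`, `rho=1`), nothing is pruned in this sketch, so all simulation effort goes into the final tree.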

Similar articles

1
Exact and efficient phylodynamic simulation from arbitrarily large populations.
ArXiv. 2024 Aug 10:arXiv:2402.17153v2.
2
Exact and efficient phylodynamic simulation from arbitrarily large populations.
Proc Natl Acad Sci U S A. 2025 May 20;122(20):e2412978122. doi: 10.1073/pnas.2412978122. Epub 2025 May 14.
3
Phylodynamic Inference with Kernel ABC and Its Application to HIV Epidemiology.
Mol Biol Evol. 2015 Sep;32(9):2483-95. doi: 10.1093/molbev/msv123. Epub 2015 May 25.
4
An algorithm for computing the gene tree probability under the multispecies coalescent and its application in the inference of population tree.
Bioinformatics. 2016 Jun 15;32(12):i225-i233. doi: 10.1093/bioinformatics/btw261.
5
Fast stochastic algorithm for simulating evolutionary population dynamics.
Bioinformatics. 2012 May 1;28(9):1230-8. doi: 10.1093/bioinformatics/bts130. Epub 2012 Mar 21.
6
Sampling through time and phylodynamic inference with coalescent and birth-death models.
J R Soc Interface. 2014 Dec 6;11(101):20140945. doi: 10.1098/rsif.2014.0945.
7
A scalability study of phylogenetic network inference methods using empirical datasets and simulations involving a single reticulation.
BMC Bioinformatics. 2016 Oct 13;17(1):422. doi: 10.1186/s12859-016-1277-1.
8
The probability distribution of the ancestral population size conditioned on the reconstructed phylogenetic tree with occurrence data.
J Theor Biol. 2021 Jan 21;509:110400. doi: 10.1016/j.jtbi.2020.110400. Epub 2020 Jul 30.
9
Fast likelihood calculation for multivariate Gaussian phylogenetic models with shifts.
Theor Popul Biol. 2020 Feb;131:66-78. doi: 10.1016/j.tpb.2019.11.005. Epub 2019 Dec 2.
10
Bayesian phylodynamic inference of population dynamics with dormancy.
bioRxiv. 2025 Jan 22:2025.01.19.633741. doi: 10.1101/2025.01.19.633741.
