Scalable Bayesian Divergence Time Estimation With Ratio Transformations.

Affiliations

Department of Mathematics, School of Science & Engineering, Tulane University, 6823 St. Charles Avenue, New Orleans, LA 70118, USA.

Department of Statistical Science, Duke University, 214 Old Chemistry, Durham, NC 27708, USA.

Publication information

Syst Biol. 2023 Nov 1;72(5):1136-1153. doi: 10.1093/sysbio/syad039.

Abstract

Divergence time estimation is crucial to provide temporal signals for dating biologically important events from species divergence to viral transmissions in space and time. With the advent of high-throughput sequencing, recent Bayesian phylogenetic studies have analyzed hundreds to thousands of sequences. Such large-scale analyses challenge divergence time reconstruction by requiring inference on highly correlated internal node heights that often become computationally infeasible. To overcome this limitation, we explore a ratio transformation that maps the original $N-1$ internal node heights into a space of one height parameter and $N-2$ ratio parameters. To make the analyses scalable, we develop a collection of linear-time algorithms to compute the gradient and Jacobian-associated terms of the log-likelihood with respect to these ratios. We then apply Hamiltonian Monte Carlo sampling with the ratio transform in a Bayesian framework to learn the divergence times in 4 pathogenic viruses (West Nile virus, rabies virus, Lassa virus, and Ebola virus) and the coralline red algae. Our method both resolves a mixing issue in the West Nile virus example and improves inference efficiency by at least 5-fold for the Lassa and rabies virus examples as well as for the algae example. Our method now also makes it computationally feasible to incorporate mixed-effects molecular clock models for the Ebola virus example, confirms the findings from the original study, and reveals clearer multimodal distributions of the divergence times of some clades of interest.
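The abstract does not spell out the ratio transformation itself. The following is a minimal sketch, assuming one common form of it: the root keeps its height, and every other internal node $i$ with parent $p$ gets the ratio $r_i = (t_i - b_i)/(t_p - b_i)$, where $b_i$ is the height of the oldest tip below node $i$. Under that assumption, the log-Jacobian of the map from (root height, ratios) back to heights is $\sum_i \log(t_p - b_i)$. The function names, the dictionary-based tree encoding, and the toy tree are all hypothetical and are not taken from the paper or its software.

```python
import math

def oldest_tip_below(node, children, tip_heights, bound):
    """Post-order pass: record the height of the oldest tip under each node."""
    kids = children.get(node)
    if not kids:  # tip
        bound[node] = tip_heights[node]
    else:
        bound[node] = max(oldest_tip_below(k, children, tip_heights, bound) for k in kids)
    return bound[node]

def heights_to_ratios(root, children, heights):
    """Map N-1 internal node heights to one root height and N-2 ratios."""
    bound = {}
    oldest_tip_below(root, children, heights, bound)
    ratios, stack = {}, [root]
    while stack:  # pre-order pass, each node visited once
        parent = stack.pop()
        for child in children[parent]:
            if child in children:  # internal, non-root node
                ratios[child] = (heights[child] - bound[child]) / (heights[parent] - bound[child])
                stack.append(child)
    return heights[root], ratios

def ratios_to_heights(root, children, tip_heights, root_height, ratios):
    """Inverse map; also accumulates the log-Jacobian of this inverse."""
    bound = {}
    oldest_tip_below(root, children, tip_heights, bound)
    heights = dict(tip_heights)
    heights[root] = root_height
    log_jacobian, stack = 0.0, [root]
    while stack:
        parent = stack.pop()
        for child in children[parent]:
            if child in children:
                span = heights[parent] - bound[child]
                heights[child] = bound[child] + ratios[child] * span
                log_jacobian += math.log(span)  # d t_child / d r_child = span
                stack.append(child)
    return heights, log_jacobian

# Toy tree with N = 4 tips: R = (X, Y), X = (A, B), Y = (C, D); tip C is older than the rest.
children = {"R": ["X", "Y"], "X": ["A", "B"], "Y": ["C", "D"]}
heights = {"A": 0.0, "B": 0.0, "C": 1.0, "D": 0.0, "X": 2.0, "Y": 3.0, "R": 5.0}

root_height, ratios = heights_to_ratios("R", children, heights)
tips = {name: heights[name] for name in "ABCD"}
recovered, log_jac = ratios_to_heights("R", children, tips, root_height, ratios)
print(root_height, ratios)  # one height (5.0) and N-2 = 2 ratios (X: 0.4, Y: 0.5)
```

Each ratio lies in (0, 1) regardless of the parent's height, which is the property that lets a sampler move the ratios without first checking that a proposed height stays between a node's parent and its oldest descendant tip.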
