The evolutionary forest algorithm.

Authors

Leman Scotland C, Uyenoyama Marcy K, Lavine Michael, Chen Yuguo

Affiliation

Institute of Statistics and Decision Sciences, Duke University, Durham, NC, USA.

Publication information

Bioinformatics. 2007 Aug 1;23(15):1962-8. doi: 10.1093/bioinformatics/btm264. Epub 2007 May 22.

Abstract

MOTIVATION

Gene genealogies offer a powerful context for inferences about the evolutionary process based on presently segregating DNA variation. In many cases, it is the distribution of population parameters, marginalized over the effectively infinite-dimensional tree space, that is of interest. Our evolutionary forest (EF) algorithm uses Monte Carlo methods to generate posterior distributions of population parameters. A novel feature is the updating of parameter values based on a probability measure defined on an ensemble of histories (a forest of genealogies), rather than a single tree.
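To make the idea of working from an ensemble of histories rather than a single tree concrete, here is a minimal toy sketch. It is not the EF algorithm and does not reproduce the authors' C++ implementation. It swaps in plain importance weighting over a fixed set of sampled latent histories. Everything in it is an assumption made for illustration: each "history" is reduced to a single branch length z drawn from an exponential distribution with rate theta, the data y is a Poisson count with mean mu * z, and the names sample_forest, forest_likelihood and grid_posterior are hypothetical.

```python
import math
import random


def poisson_pmf(y, mean):
    """Probability of observing y events under a Poisson with the given mean."""
    return math.exp(-mean) * mean ** y / math.factorial(y)


def sample_forest(k, theta0, rng):
    """Draw k toy 'histories' (branch lengths) at a driving value theta0."""
    return [rng.expovariate(theta0) for _ in range(k)]


def forest_likelihood(theta, y, forest, theta0, mu=1.0):
    """Estimate p(y | theta) by averaging over the whole ensemble.

    Each history z contributes p(y | z) reweighted by p(z | theta) / p(z | theta0),
    so the same forest can be reused for many candidate values of theta.
    """
    total = 0.0
    for z in forest:
        weight = (theta * math.exp(-theta * z)) / (theta0 * math.exp(-theta0 * z))
        total += poisson_pmf(y, mu * z) * weight
    return total / len(forest)


def exact_likelihood(theta, y, mu=1.0):
    """Closed-form marginal likelihood for this toy model, used only as a check."""
    return theta * mu ** y / (theta + mu) ** (y + 1)


def grid_posterior(y, grid, forest, theta0, mu=1.0):
    """Posterior over a grid of theta values under a flat prior on the grid."""
    like = [forest_likelihood(t, y, forest, theta0, mu) for t in grid]
    norm = sum(like)
    return [v / norm for v in like]


if __name__ == "__main__":
    rng = random.Random(0)
    forest = sample_forest(20000, theta0=1.0, rng=rng)
    grid = [0.5, 1.0, 2.0]
    for t, p in zip(grid, grid_posterior(3, grid, forest, theta0=1.0)):
        print(f"theta={t}: posterior weight {p:.3f}")
```

Because this toy model has a closed-form marginal likelihood, the ensemble estimate can be compared directly against exact_likelihood. Real genealogies offer no such shortcut, which is why a Monte Carlo treatment of tree space is needed in the first place.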

RESULTS

The EF algorithm generates samples from the correct marginal distribution of population parameters. Applied to actual data from closely related fruit fly species, it rapidly converged to posterior distributions that closely approximated the exact posteriors generated through massive computational effort. Applied to simulated data, it generated credible intervals that covered the actual parameter values in accordance with the nominal probabilities.
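The coverage claim for simulated data can be illustrated with a generic calibration check. The sketch below is not the authors' experiment. It assumes a conjugate normal model (prior theta ~ N(0, 1), observations y_i ~ N(theta, 1)), chosen only because its posterior is known exactly, and the function name coverage is hypothetical. It draws a true parameter from the prior, simulates data, forms a central credible interval, and counts how often the interval contains the true value.

```python
import random
from statistics import NormalDist


def coverage(n_reps=2000, n_obs=10, level=0.90, seed=1):
    """Fraction of simulated data sets whose credible interval covers the truth."""
    rng = random.Random(seed)
    tail = (1.0 - level) / 2.0
    hits = 0
    for _ in range(n_reps):
        theta = rng.gauss(0.0, 1.0)                      # true value drawn from the prior
        ys = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        # Conjugate update: posterior precision is 1 + n_obs, mean is sum(ys) / (n_obs + 1).
        post = NormalDist(mu=sum(ys) / (n_obs + 1),
                          sigma=(1.0 / (n_obs + 1)) ** 0.5)
        lo, hi = post.inv_cdf(tail), post.inv_cdf(1.0 - tail)
        hits += lo <= theta <= hi
    return hits / n_reps


if __name__ == "__main__":
    print(f"empirical coverage of 90% intervals: {coverage():.3f}")
```

When the posterior is computed correctly, the empirical fraction should sit close to the nominal level, up to Monte Carlo error. A sampler that produced intervals too narrow or too wide would show up as coverage noticeably below or above that level.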

AVAILABILITY

A C++ implementation of this method is freely accessible at http://www.isds.duke.edu/~scl13
