Model selection for mixtures of mutagenetic trees.

Authors

Yin Junming, Beerenwinkel Niko, Rahnenführer Jörg, Lengauer Thomas

Affiliation

Department of EECS, University of California, Berkeley, CA, USA.

Publication information

Stat Appl Genet Mol Biol. 2006;5:Article17. doi: 10.2202/1544-6115.1164. Epub 2006 Jun 23.

DOI: 10.2202/1544-6115.1164
PMID: 17049028
Abstract

The evolution of drug resistance in HIV is characterized by the accumulation of resistance-associated mutations in the HIV genome. Mutagenetic trees, a family of restricted Bayesian tree models, have been applied to infer the order and rate of occurrence of these mutations. Understanding and predicting this evolutionary process is an important prerequisite for the rational design of antiretroviral therapies. In practice, mixture models of K mutagenetic trees provide more flexibility and are often more appropriate for modelling observed mutational patterns. Here, we investigate the model selection problem for K-mutagenetic trees mixture models. We evaluate several classical model selection criteria including cross-validation, the Bayesian Information Criterion (BIC), and the Akaike Information Criterion (AIC). We also use the empirical Bayes method by constructing a prior probability distribution for the parameters of a mutagenetic trees mixture model and deriving the posterior probability of the model. In addition to the model dimension, we consider the redundancy of a mixture model, which is measured by comparing the topologies of trees within a mixture model. Based on the redundancy, we propose a new model selection criterion, which is a modification of the BIC. Experimental results on simulated and on real HIV data show that the classical criteria tend to select models with far too many tree components. Only cross-validation and the modified BIC recover the correct number of trees and the tree topologies most of the time. At the same optimal performance, the runtime of the new BIC modification is about one order of magnitude lower. Thus, this model selection criterion can also be used for large data sets for which cross-validation becomes computationally infeasible.
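
As a rough illustration of how the classical criteria named in the abstract rank candidate values of K, the sketch below scores a set of fitted mixtures with the standard textbook forms AIC = -2 log L + 2d and BIC = -2 log L + d log N, where d is the number of free parameters and N the number of samples. The function names, the log-likelihoods, and the parameter counts are hypothetical placeholders and not results from the paper. The redundancy-based modification of the BIC is not reproduced, because the abstract does not state its formula.

```python
import math


def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * num_params


def bic(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Bayesian Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + num_params * math.log(num_samples)


def select_k(fits: dict, num_samples: int, criterion: str = "bic"):
    """Return (best_k, scores) for fits given as {K: (log_likelihood, num_params)}.

    `fits` is assumed to hold the maximized log-likelihood and the model
    dimension of an already fitted K-component mixture; the fitting step
    itself is out of scope for this sketch.
    """
    scores = {}
    for k, (loglik, dim) in fits.items():
        if criterion == "aic":
            scores[k] = aic(loglik, dim)
        elif criterion == "bic":
            scores[k] = bic(loglik, dim, num_samples)
        else:
            raise ValueError(f"unknown criterion: {criterion}")
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Hypothetical fits: invented numbers used only to exercise the functions.
example_fits = {
    1: (-1250.0, 6),
    2: (-1180.0, 13),
    3: (-1172.0, 20),
    4: (-1166.0, 27),
}

best_bic, _ = select_k(example_fits, num_samples=500, criterion="bic")
best_aic, _ = select_k(example_fits, num_samples=500, criterion="aic")
print("BIC selects K =", best_bic)
print("AIC selects K =", best_aic)
```

With these placeholder numbers the AIC, whose penalty per parameter is smaller once N exceeds about 7, settles on a larger K than the BIC. A penalty based on parameter count alone says nothing about whether the extra trees differ in topology, which is the gap the redundancy measure described in the abstract is meant to address.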

Similar articles

1. Model selection for mixtures of mutagenetic trees. Stat Appl Genet Mol Biol. 2006;5:Article17. doi: 10.2202/1544-6115.1164. Epub 2006 Jun 23.
2. Does choice in model selection affect maximum likelihood analysis? Syst Biol. 2008 Feb;57(1):76-85. doi: 10.1080/10635150801898920.
3. The effect of branch length variation on the selection of models of molecular evolution. J Mol Evol. 2001 May;52(5):434-44. doi: 10.1007/s002390010173.
4. Accommodating uncertainty in a tree set for function estimation. Stat Appl Genet Mol Biol. 2008;7(1):Article5. doi: 10.2202/1544-6115.1324. Epub 2008 Feb 19.
5. Estimating HIV evolutionary pathways and the genetic barrier to drug resistance. J Infect Dis. 2005 Jun 1;191(11):1953-60. doi: 10.1086/430005. Epub 2005 Apr 28.
6. Mtreemix: a software package for learning and using mixture models of mutagenetic trees. Bioinformatics. 2005 May 1;21(9):2106-7. doi: 10.1093/bioinformatics/bti274. Epub 2005 Jan 18.
7. A model of directional selection applied to the evolution of drug resistance in HIV-1. Mol Biol Evol. 2007 Apr;24(4):1025-31. doi: 10.1093/molbev/msm021. Epub 2007 Feb 1.
8. Fair-balance paradox, star-tree paradox, and Bayesian phylogenetics. Mol Biol Evol. 2007 Aug;24(8):1639-55. doi: 10.1093/molbev/msm081. Epub 2007 May 7.
9. Species trees from gene trees: reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions. Syst Biol. 2007 Jun;56(3):504-14. doi: 10.1080/10635150701429982.
10. A Bayesian model comparison approach to inferring positive selection. Mol Biol Evol. 2005 Dec;22(12):2531-40. doi: 10.1093/molbev/msi250. Epub 2005 Aug 24.

Cited by

1. Construction of oncogenetic tree models reveals multiple pathways of oral cancer progression. Int J Cancer. 2009 Jun 15;124(12):2864-71. doi: 10.1002/ijc.24267.
2. Stability analysis of mixtures of mutagenetic trees. BMC Bioinformatics. 2008 Mar 26;9:165. doi: 10.1186/1471-2105-9-165.