Modelling the ancestral sequence distribution and model frequencies in context-dependent models for primate non-coding sequences.

Affiliation

Department of Plant Systems Biology, VIB, B-9052 Ghent, Belgium.

Publication information

BMC Evol Biol. 2010 Aug 10;10:244. doi: 10.1186/1471-2148-10-244.

DOI:10.1186/1471-2148-10-244
PMID:20698960
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2928787/
Abstract

BACKGROUND

Recent approaches for context-dependent evolutionary modelling assume that the evolution of a given site depends upon its ancestor and that ancestor's immediate flanking sites. Because such a dependency pattern cannot be imposed on the root sequence, we consider the use of different orders of Markov chains to model dependence at the ancestral root sequence. Root distributions which are coupled to the context-dependent model across the underlying phylogenetic tree are deemed more realistic than decoupled Markov chain models, as the evolutionary process is responsible for shaping the composition of the ancestral root sequence.
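To make the idea of imposing Markov chains of different orders on a single sequence concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain nucleotide string, estimates the transition probabilities from that same string with additive smoothing, and conditions on the first `order` bases. The function name `markov_log_likelihood` and the `pseudocount` parameter are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def markov_log_likelihood(seq, order, pseudocount=1.0):
    """Log-likelihood of seq[order:] given its first `order` bases.

    Transition probabilities of the order-`order` Markov chain are
    estimated from `seq` itself with additive smoothing (an assumption
    made for this sketch, not something taken from the paper).
    """
    context_counts = Counter()     # how often each context precedes a site
    transition_counts = Counter()  # how often (context, base) occurs
    for i in range(order, len(seq)):
        context = seq[i - order:i]
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1

    log_lik = 0.0
    for i in range(order, len(seq)):
        context = seq[i - order:i]
        numerator = transition_counts[(context, seq[i])] + pseudocount
        denominator = context_counts[context] + pseudocount * len(ALPHABET)
        log_lik += math.log(numerator / denominator)
    return log_lik

# Toy input: a perfectly periodic sequence, so a first-order chain
# predicts each base from its predecessor far better than an order-0 model.
toy = "ACGT" * 10
ll_order0 = markov_log_likelihood(toy, 0)
ll_order1 = markov_log_likelihood(toy, 1)
```

These in-sample log-likelihoods only illustrate how the order changes the dependency structure. They are not a substitute for the Bayes Factor comparison used in the paper, since higher orders have more parameters and condition on more initial sites.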

RESULTS

We find strong support, in terms of Bayes Factors, for using a second-order Markov chain at the ancestral root sequence along with a context-dependent model throughout the remainder of the phylogenetic tree in an ancestral repeats dataset, and for using a first-order Markov chain at the ancestral root sequence in a pseudogene dataset. Relaxing the assumption of a single context-independent set of independent model frequencies, as presented in previous work, yields a further drastic increase in model fit. We show that the substitution rates associated with the CpG-methylation-deamination process can be modelled through context-dependent model frequencies and that their accuracy depends on the (order of the) Markov chain imposed at the ancestral root sequence. In addition, we provide evidence that this approach (which assumes that root distribution and evolutionary model are decoupled) outperforms an approach inspired by the work of Arndt et al., where the root distribution is coupled to the evolutionary model. We show that the continuous-time approximation of Hwang and Green has stronger support in terms of Bayes Factors, but the parameter estimates show minimal differences.
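Because the results are stated in terms of Bayes Factors, a short sketch of how such a comparison is read may help. It assumes that the log marginal likelihoods of two competing models have already been estimated by some method; the numbers below are invented placeholders, not values from the paper. The function names are hypothetical, and the verbal labels follow the commonly used Kass and Raftery scale on 2 ln(BF), which the abstract itself does not mention.

```python
def log_bayes_factor(log_marginal_1, log_marginal_0):
    """Natural-log Bayes Factor of model 1 over model 0."""
    return log_marginal_1 - log_marginal_0

def kass_raftery_label(log_bf):
    """Verbal reading of 2*ln(BF) on the Kass and Raftery scale."""
    two_ln_bf = 2.0 * log_bf
    if two_ln_bf < 0:
        return "favours model 0"
    if two_ln_bf < 2:
        return "not worth more than a bare mention"
    if two_ln_bf < 6:
        return "positive"
    if two_ln_bf < 10:
        return "strong"
    return "very strong"

# Placeholder log marginal likelihoods (hypothetical, for illustration only).
log_m_second_order_root = -1000.0
log_m_first_order_root = -1010.0
log_bf = log_bayes_factor(log_m_second_order_root, log_m_first_order_root)
label = kass_raftery_label(log_bf)
```

Working on the log scale avoids underflow, since marginal likelihoods of alignment-sized datasets are far too small to represent directly.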

CONCLUSIONS

We show that the combination of a dependency scheme at the ancestral root sequence and a context-dependent evolutionary model across the remainder of the tree allows for accurate estimation of the model's parameters. The different assumptions tested in this manuscript clearly show that designing accurate context-dependent models is a complex process, with many different assumptions that require validation. Further, these assumptions are shown to change across different datasets, making the search for an adequate model for a given dataset quite challenging.
