A model-based approach to study nearest-neighbor influences reveals complex substitution patterns in non-coding sequences.

Authors

Baele Guy, Van de Peer Yves, Vansteelandt Stijn

Affiliation

Department of Applied Mathematics and Computer Science, Ghent University, Ghent, Belgium.

Publication

Syst Biol. 2008 Oct;57(5):675-92. doi: 10.1080/10635150802422324.

DOI: 10.1080/10635150802422324
PMID: 18853356

Abstract

In this article, we present a likelihood-based framework for modeling site dependencies. Our approach builds upon standard evolutionary models but incorporates site dependencies across the entire tree by letting the evolutionary parameters in these models depend upon the ancestral states at the neighboring sites. It thus avoids the need for introducing new and high-dimensional evolutionary models for site-dependent evolution. We propose a Markov chain Monte Carlo approach with data augmentation to infer the evolutionary parameters under our model. Although our approach allows for wide-ranging site dependencies, we illustrate its use, in two non-coding datasets, in the case of nearest-neighbor dependencies (i.e., evolution directly depending only upon the immediate flanking sites). The results reveal that the general time-reversible model with nearest-neighbor dependencies substantially improves the fit to the data as compared to the corresponding model with site independence. Using the parameter estimates from our model, we elaborate on the importance of the 5-methylcytosine deamination process (i.e., the CpG effect) and show that this process also depends upon the 5' neighboring base identity. We hint at the possibility of a so-called TpA effect and show that the observed substitution behavior is very complex in the light of dinucleotide estimates. We also discuss the presence of CpG effects in a nuclear small subunit dataset and find significant evidence that evolutionary models incorporating context-dependent effects perform substantially better than independent-site models and in some cases even outperform models that incorporate varying rates across sites.
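
The central idea in the abstract, letting the parameters of a standard model depend on the states at the neighboring sites, can be illustrated with a small sketch. The code below is not the authors' implementation and does not perform the MCMC inference; it only assumes one set of general time-reversible (GTR) exchangeabilities per (left neighbor, right neighbor) context, which gives 16 contexts for nearest-neighbor dependence. All function names and numeric values are hypothetical, chosen to mimic an elevated C/T exchange rate when the 3' neighbor is G (a CpG-like effect).

```python
from itertools import combinations, product

BASES = "ACGT"


def gtr_rate_matrix(exchange, freqs):
    """Build a GTR rate matrix as nested dicts: q[src][dst].

    exchange maps frozenset({x, y}) to a symmetric exchangeability;
    freqs maps each base to its equilibrium frequency.
    """
    q = {x: {} for x in BASES}
    for x in BASES:
        for y in BASES:
            if x != y:
                q[x][y] = exchange[frozenset((x, y))] * freqs[y]
        # Diagonal entry makes each row sum to zero.
        q[x][x] = -sum(q[x][y] for y in BASES if y != x)
    return q


def baseline_exchange():
    """Hypothetical baseline: transitions twice as fast as transversions."""
    transitions = {frozenset("AG"), frozenset("CT")}
    return {
        frozenset(pair): (2.0 if frozenset(pair) in transitions else 1.0)
        for pair in combinations(BASES, 2)
    }


def context_rate(params_by_context, freqs, left, right, src, dst):
    """Rate src -> dst at a site whose neighbors currently are left and right."""
    q = gtr_rate_matrix(params_by_context[(left, right)], freqs)
    return q[src][dst]


# One parameter set per nearest-neighbor context (4 x 4 = 16 contexts).
freqs = {b: 0.25 for b in BASES}
params_by_context = {
    (left, right): baseline_exchange()
    for left, right in product(BASES, repeat=2)
}

# Illustrative CpG-like effect: raise the C/T exchangeability whenever the
# 3' neighbor is G. Because GTR is reversible, T -> C is raised as well.
for left in BASES:
    params_by_context[(left, "G")][frozenset("CT")] = 10.0

rate_cpg = context_rate(params_by_context, freqs, "A", "G", "C", "T")
rate_other = context_rate(params_by_context, freqs, "A", "A", "C", "T")
print(rate_cpg, rate_other)
```

With these made-up values, the C to T rate in a context with a 3' G is five times the rate in other contexts. A site-independent model corresponds to the special case where all 16 contexts share the same parameter set.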

Similar articles

1
A model-based approach to study nearest-neighbor influences reveals complex substitution patterns in non-coding sequences.
Syst Biol. 2008 Oct;57(5):675-92. doi: 10.1080/10635150802422324.
2
Continuous and tractable models for the variation of evolutionary rates.
Math Biosci. 2006 Feb;199(2):216-33. doi: 10.1016/j.mbs.2005.11.002. Epub 2006 Jan 10.
3
Evolutionary model selection with a genetic algorithm: a case study using stem RNA.
Mol Biol Evol. 2007 Jan;24(1):159-70. doi: 10.1093/molbev/msl144. Epub 2006 Oct 12.
4
Reconstruction of ancestral nucleotide sequences and estimation of substitution frequencies in a star phylogeny.
Gene. 2007 Apr 1;390(1-2):75-83. doi: 10.1016/j.gene.2006.11.022. Epub 2006 Dec 14.
5
An evolutionary space-time model with varying among-site dependencies.
Mol Biol Evol. 2006 Feb;23(2):392-400. doi: 10.1093/molbev/msj044. Epub 2005 Nov 2.
6
Inferring complex DNA substitution processes on phylogenies using uniformization and data augmentation.
Syst Biol. 2006 Apr;55(2):259-69. doi: 10.1080/10635150500541599.
7
Computational methods for evaluating phylogenetic models of coding sequence evolution with dependence between codons.
Mol Biol Evol. 2009 Jul;26(7):1663-76. doi: 10.1093/molbev/msp078. Epub 2009 Apr 21.
8
A reversible jump method for Bayesian phylogenetic inference with a nonhomogeneous substitution model.
Mol Biol Evol. 2007 Jun;24(6):1286-99. doi: 10.1093/molbev/msm046. Epub 2007 Mar 7.
9
Taking variation of evolutionary rates between sites into account in inferring phylogenies.
J Mol Evol. 2001 Oct-Nov;53(4-5):447-55. doi: 10.1007/s002390010234.
10
An evolutionary model for protein-coding regions with conserved RNA structure.
Mol Biol Evol. 2004 Oct;21(10):1913-22. doi: 10.1093/molbev/msh199. Epub 2004 Jun 30.

Cited by

1
Guanine holes are prominent targets for mutation in cancer and inherited disease.
PLoS Genet. 2013;9(9):e1003816. doi: 10.1371/journal.pgen.1003816. Epub 2013 Sep 26.
2
Make the most of your samples: Bayes factor estimators for high-dimensional models of sequence evolution.
BMC Bioinformatics. 2013 Mar 6;14:85. doi: 10.1186/1471-2105-14-85.
3
Context-dependent codon partition models provide significant increases in model fit in atpB and rbcL protein-coding genes.
BMC Evol Biol. 2011 May 27;11:145. doi: 10.1186/1471-2148-11-145.
4
Coordinated genome-wide modifications within proximal promoter cis-regulatory elements during vertebrate evolution.
Genome Biol Evol. 2011;3:66-74. doi: 10.1093/gbe/evq078. Epub 2010 Nov 30.
5
Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model.
BMC Bioinformatics. 2010 Sep 16;11:464. doi: 10.1186/1471-2105-11-464.
6
Context dependent substitution biases vary within the human genome.
BMC Bioinformatics. 2010 Sep 15;11:462. doi: 10.1186/1471-2105-11-462.
7
Modelling the ancestral sequence distribution and model frequencies in context-dependent models for primate non-coding sequences.
BMC Evol Biol. 2010 Aug 10;10:244. doi: 10.1186/1471-2148-10-244.
8
Using non-reversible context-dependent evolutionary models to study substitution patterns in primate non-coding sequences.
J Mol Evol. 2010 Jul;71(1):34-50. doi: 10.1007/s00239-010-9362-y. Epub 2010 Jul 11.
9
Mutation-selection models of coding sequence evolution with site-heterogeneous amino acid fitness profiles.
Proc Natl Acad Sci U S A. 2010 Mar 9;107(10):4629-34. doi: 10.1073/pnas.0910915107. Epub 2010 Feb 22.
10
COMIT: identification of noncoding motifs under selection in coding sequences.
Genome Biol. 2009;10(11):R133. doi: 10.1186/gb-2009-10-11-r133. Epub 2009 Nov 20.