The impact of modelling rate heterogeneity among sites on phylogenetic estimates of intraspecific evolutionary rates and timescales.

Affiliation

School of Biological Sciences, University of Sydney, Sydney, New South Wales, Australia.

Publication information

PLoS One. 2014 May 5;9(5):e95722. doi: 10.1371/journal.pone.0095722. eCollection 2014.

DOI:10.1371/journal.pone.0095722
PMID:24798481
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4010409/
Abstract

Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often modelled using a discrete gamma distribution. A widely used derivative of this is the gamma-invariable mixture model, which assumes that a proportion of sites in the sequence are completely resistant to change, while substitution rates at the remaining sites are gamma-distributed. For data sampled at the intraspecific level, however, biological assumptions involved in the invariable-sites model are commonly violated. We examined the use of these models in analyses of five intraspecific data sets. We show that using 6-10 rate categories for the discrete gamma distribution of rates among sites is sufficient to provide a good approximation of the marginal likelihood. Increasing the number of gamma rate categories did not have a substantial effect on estimates of the substitution rate or coalescence time, unless rates varied strongly among sites in a non-gamma-distributed manner. The assumption of a proportion of invariable sites provided a better approximation of the asymptotic marginal likelihood when the number of gamma categories was small, but had minimal impact on estimates of rates and coalescence times. However, the estimated proportion of invariable sites was highly susceptible to changes in the number of gamma rate categories. The concurrent use of gamma and invariable-site models for intraspecific data is not biologically meaningful and has been challenged on statistical grounds; here we have found that the assumption of a proportion of invariable sites has no obvious impact on Bayesian estimates of rates and timescales from intraspecific data.
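
The two site-rate models named in the abstract can be illustrated with a minimal sketch. It assumes the common equal-probability, mean-per-category discretisation of a gamma distribution with mean 1, and one common convention for the gamma-invariable mixture in which the variable-site rates are rescaled so that the overall mean rate stays at 1. The function names, the shape value 0.5, and the invariable proportion 0.2 are illustrative choices only; none of them come from the paper or from any particular phylogenetics package.

```python
import math


def reg_lower_gamma(a, x, max_terms=100000):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gamma_quantile_scaled(a, p):
    """Return x such that P(a, x) = p, found by bisection."""
    lo, hi = 0.0, max(1.0, a)
    while reg_lower_gamma(a, hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k):
    """Mean rate in each of k equal-probability categories of a
    Gamma(shape=alpha, rate=alpha) distribution, whose mean is 1."""
    cuts = [gamma_quantile_scaled(alpha, i / k) for i in range(1, k)]
    # Cumulative share of the total rate that lies below each cut point.
    cum = [0.0] + [reg_lower_gamma(alpha + 1.0, x) for x in cuts] + [1.0]
    return [k * (cum[i + 1] - cum[i]) for i in range(k)]


def gamma_plus_invariant(alpha, k, p_inv):
    """(weight, rate) pairs for a gamma+I mixture rescaled to mean rate 1."""
    variable = discrete_gamma_rates(alpha, k)
    cats = [(p_inv, 0.0)]  # invariable class: rate fixed at zero
    cats += [((1.0 - p_inv) / k, r / (1.0 - p_inv)) for r in variable]
    return cats


if __name__ == "__main__":
    # Compare how the category rates change as k grows (alpha is hypothetical).
    for k in (4, 6, 10):
        print(k, [round(r, 4) for r in discrete_gamma_rates(0.5, k)])
    print(gamma_plus_invariant(0.5, 4, 0.2))
```

Printing the rates for several values of k shows how a coarse discretisation truncates the upper tail of the distribution. This is one way to picture why an added invariable-sites class can interact with the number of gamma categories.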

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ee6/4010409/4dae58ebd836/pone.0095722.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ee6/4010409/0579326c1db1/pone.0095722.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ee6/4010409/a23aa40c5056/pone.0095722.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ee6/4010409/2a4380ecf688/pone.0095722.g004.jpg
