Constructing a meaningful evolutionary average at the phylogenetic center of mass.

Authors

Stone Eric A, Sidow Arend

Affiliation

Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695-7566, USA.

Publication

BMC Bioinformatics. 2007 Jun 26;8:222. doi: 10.1186/1471-2105-8-222.

DOI:10.1186/1471-2105-8-222
PMID:17594490
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1919398/
Abstract

BACKGROUND

As a consequence of the evolutionary process, data collected from related species tend to be similar. This similarity by descent can obscure subtler signals in the data such as the evidence of constraint on variation due to shared selective pressures. In comparative sequence analysis, for example, sequence similarity is often used to illuminate important regions of the genome, but if the comparison is between closely related species, then similarity is the rule rather than the interesting exception. Furthermore, and perhaps worse yet, the contribution of a divergent third species may be masked by the strong similarity between the other two. Here we propose a remedy that weighs the contribution of each species according to its phylogenetic placement.

RESULTS

We first solve the problem of summarizing data related by phylogeny, and we explain why an average should operate on the entire evolutionary trajectory that relates the data. This perspective leads to a new approach in which we define the average in terms of the phylogeny, using the data and a stochastic model to obtain a probability on evolutionary trajectories. With the assumption that the data evolve according to a Brownian motion process on the tree, we show that our evolutionary average can be computed as a convex combination of the species data. Thus, our approach, called the BranchManager, defines both an average and a novel taxon weighting scheme. We compare the BranchManager to two other methods, demonstrating why it exhibits desirable properties. In doing so, we devise a framework for comparison and introduce the concept of a representative point at which the average is situated.

CONCLUSION

The BranchManager uses as its representative point the phylogenetic center of mass, a choice which has both intuitive and practical appeal. Because our average is intrinsic to both the dataset and to the phylogeny, we expect it and its corresponding weighting scheme to be useful in all sorts of studies where interspecies data need to be combined. Obvious applications include evolutionary studies of morphology, physiology or behaviour, but quantitative measures such as sequence hydrophobicity and gene expression level are amenable to our approach as well. Other areas of potential impact include motif discovery and vaccine design. A Java implementation of the BranchManager is available for download, as is a script written in the statistical language R.
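
To make the idea of a phylogeny-based convex combination concrete, the sketch below computes taxon weights for a small tree under Brownian motion. It is not the BranchManager: it implements the standard maximum-likelihood estimate of the root value (Felsenstein-style pruning), which places the average at the root rather than at the phylogenetic center of mass. The tree encoding, the function name `bm_root_weights`, and the example branch lengths and trait values are all assumptions made for illustration.

```python
# Illustrative sketch only: root-value weights under Brownian motion,
# NOT the BranchManager weighting described in the abstract.
# Assumed tree encoding: a leaf is a string; an internal node is a list of
# (child, branch_length) pairs. All branch lengths must be positive.

def bm_root_weights(node):
    """Return (weights, extra_variance) for the subtree rooted at `node`.

    `weights` maps each leaf name to its weight in the estimate of the
    value at `node`; the weights are non-negative and sum to 1.
    `extra_variance` is the uncertainty of that estimate, in branch-length units.
    """
    if isinstance(node, str):
        # A leaf is observed directly: full weight on itself, no extra variance.
        return {node: 1.0}, 0.0

    precisions = []
    child_weights = []
    for child, length in node:
        w, extra = bm_root_weights(child)
        # Under Brownian motion, variance grows linearly along the branch.
        precisions.append(1.0 / (length + extra))
        child_weights.append(w)

    total = sum(precisions)
    weights = {}
    for p, w in zip(precisions, child_weights):
        for name, value in w.items():
            # Each child contributes in proportion to its precision.
            weights[name] = weights.get(name, 0.0) + (p / total) * value
    return weights, 1.0 / total


# Hypothetical example: A and B are close relatives, C is divergent.
# Newick equivalent: ((A:1,B:1):1,C:2);
tree = [([("A", 1.0), ("B", 1.0)], 1.0), ("C", 2.0)]
data = {"A": 1.0, "B": 1.0, "C": 8.0}  # made-up trait values

weights, _ = bm_root_weights(tree)
weighted_average = sum(weights[name] * data[name] for name in data)
plain_mean = sum(data.values()) / len(data)

print(weights)
print(weighted_average, plain_mean)
```

In this example the two close relatives share their weight (2/7 each) while the divergent species C receives 3/7, so the weighted average sits closer to C than the plain mean does. This is the masking effect the Background describes, counteracted by a tree-aware weighting.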

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7680/1919398/e0c5f37dacaa/1471-2105-8-222-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7680/1919398/bf1eb5e1825f/1471-2105-8-222-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7680/1919398/f4037a633b0b/1471-2105-8-222-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7680/1919398/4ef106624ed7/1471-2105-8-222-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7680/1919398/2e51298a5fbb/1471-2105-8-222-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7680/1919398/8eafcc98989b/1471-2105-8-222-6.jpg
