A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference.

Authors

Shen Xing-Xing, Salichos Leonidas, Rokas Antonis

Affiliations

Department of Biological Sciences, Vanderbilt University.

Department of Biological Sciences, Vanderbilt University; Department of Molecular Biophysics and Biochemistry, Yale University.

Publication information

Genome Biol Evol. 2016 Sep 2;8(8):2565-80. doi: 10.1093/gbe/evw179.

DOI: 10.1093/gbe/evw179
PMID: 27492233
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5010910/
Abstract

Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal, in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, along with the number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal and could be useful in guiding the choice of phylogenetic markers.
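
The partial-correlation step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: it assumes Pearson correlation (the abstract does not say which coefficient was used), applies the textbook recursion for removing one control variable at a time, and runs on synthetic data. The names `pearson`, `partial_corr`, `demo`, and the simulated variables (`length`, `rate`, `gc`, `signal`) are hypothetical and chosen only for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, controls):
    """Correlation of x and y with every variable in `controls` held fixed.

    Standard recursion: take the partial correlations that already
    condition on all but the last control, then remove that last one.
    """
    if not controls:
        return pearson(x, y)
    *rest, z = controls
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def demo(n_genes=500, seed=0):
    """Simulate genes (synthetic, not real data) and compare raw vs partial r."""
    rng = random.Random(seed)
    length = [rng.gauss(0, 1) for _ in range(n_genes)]  # stand-in for alignment length
    rate = [rng.gauss(0, 1) for _ in range(n_genes)]    # stand-in for evolutionary rate
    # A property that tracks length but has no direct effect on signal.
    gc = [l + 0.5 * rng.gauss(0, 1) for l in length]
    # Signal driven only by length and rate, plus noise.
    signal = [l - r + rng.gauss(0, 1) for l, r in zip(length, rate)]
    raw = pearson(gc, signal)
    controlled = partial_corr(gc, signal, [rate, length])
    return raw, controlled


if __name__ == "__main__":
    raw, controlled = demo()
    print(f"raw r = {raw:.3f}, partial r (rate, length controlled) = {controlled:.3f}")
```

In this toy setup the raw correlation between the simulated property and the signal is driven entirely by the shared dependence on length, so it should shrink toward zero once rate and length are controlled. In the study itself, by contrast, the abstract reports that the patterns persisted after controlling, only with reduced strength.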

