Likelihood-based tree reconstruction on a concatenation of aligned sequence data sets can be statistically inconsistent.

Authors

Roch Sebastien, Steel Mike

Affiliations

Department of Mathematics, University of Wisconsin-Madison, Madison, WI, USA.

MS Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand.

Publication information

Theor Popul Biol. 2015 Mar;100C:56-62. doi: 10.1016/j.tpb.2014.12.005. Epub 2014 Dec 26.

DOI:10.1016/j.tpb.2014.12.005
PMID:25545843
Abstract

The reconstruction of a species tree from genomic data faces a double hurdle. First, the (gene) tree describing the evolution of each gene may differ from the species tree, for instance, due to incomplete lineage sorting. Second, the aligned genetic sequences at the leaves of each gene tree provide merely an imperfect estimate of the topology of the gene tree. In this note, we demonstrate formally that a basic statistical problem arises if one tries to avoid accounting for these two processes and analyses the genetic data directly via a concatenation approach. More precisely, we show that, under the multispecies coalescent with a standard site substitution model, maximum likelihood estimation on sequence data that has been concatenated across genes and performed under the incorrect assumption that all sites have evolved independently and identically on a fixed tree is a statistically inconsistent estimator of the species tree. Our results provide a formal justification of simulation results described by Kubatko and Degnan (2007) and others, and complement recent theoretical results by DeGiorgio and Degnan (2010) and Chifman and Kubatko (2014).
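
The first hurdle named in the abstract, gene trees that disagree with the species tree because of incomplete lineage sorting, can be illustrated with a minimal sketch. It assumes a rooted three-taxon species tree ((A,B),C) whose internal branch has length `t` in coalescent units, for which the standard multispecies coalescent result gives a concordance probability of 1 - (2/3)e^(-t). The function names and topology labels are hypothetical, and the sketch does not reproduce the paper's inconsistency proof or any sequence-level analysis.

```python
import math
import random


def concordance_probability(t):
    """Probability that a gene tree matches the species tree ((A,B),C).

    t is the internal branch length in coalescent units (an assumption
    of this sketch, not a parameter taken from the paper).
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_gene_tree(t, rng):
    """Draw one gene tree topology under the three-taxon coalescent."""
    # Two lineages (A and B) coalesce at rate 1 in coalescent units.
    if rng.expovariate(1.0) < t:
        return "AB|C"  # coalesced inside the internal branch
    # Otherwise all three lineages reach the root population, where
    # each of the three pairs is equally likely to coalesce first.
    return rng.choice(["AB|C", "AC|B", "BC|A"])


def estimate_concordance(t, n_genes, seed=0):
    """Fraction of simulated gene trees that match the species tree."""
    rng = random.Random(seed)
    hits = sum(simulate_gene_tree(t, rng) == "AB|C" for _ in range(n_genes))
    return hits / n_genes


if __name__ == "__main__":
    for t in (0.1, 0.5, 2.0):
        print(t, concordance_probability(t), estimate_concordance(t, 100_000))
```

With a short internal branch, a large share of genes follow one of the two discordant topologies, which is the setting in which treating all concatenated sites as having evolved on a single fixed tree is a misspecification.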

Similar articles

1. Likelihood-based tree reconstruction on a concatenation of aligned sequence data sets can be statistically inconsistent.
Theor Popul Biol. 2015 Mar;100C:56-62. doi: 10.1016/j.tpb.2014.12.005. Epub 2014 Dec 26.
2. To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods.
Syst Biol. 2018 Mar 1;67(2):285-303. doi: 10.1093/sysbio/syx077.
3. A comparative study of SVDquartets and other coalescent-based species tree estimation methods.
BMC Genomics. 2015;16 Suppl 10(Suppl 10):S2. doi: 10.1186/1471-2164-16-S10-S2. Epub 2015 Oct 2.
4. Challenges in Species Tree Estimation Under the Multispecies Coalescent Model.
Genetics. 2016 Dec;204(4):1353-1368. doi: 10.1534/genetics.116.190173.
5. StarBEAST2 Brings Faster Species Tree Inference and Accurate Estimates of Substitution Rates.
Mol Biol Evol. 2017 Aug 1;34(8):2101-2114. doi: 10.1093/molbev/msx126.
6. Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood.
Evolution. 2012 Mar;66(3):763-775. doi: 10.1111/j.1558-5646.2011.01476.x. Epub 2011 Nov 2.
7. Why Concatenation Fails Near the Anomaly Zone.
Syst Biol. 2018 Jan 1;67(1):158-169. doi: 10.1093/sysbio/syx063.
8. Concatenation Analyses in the Presence of Incomplete Lineage Sorting.
PLoS Curr. 2015 May 22;7:ecurrents.currents.tol.8d41ac0f13d1abedf4c4a59f5d17b1f7. doi: 10.1371/currents.tol.8d41ac0f13d1abedf4c4a59f5d17b1f7.
9. A stochastic Farris transform for genetic data under the multispecies coalescent with applications to data requirements.
J Math Biol. 2022 Apr 8;84(5):36. doi: 10.1007/s00285-022-01731-5.
10. On the Robustness to Gene Tree Estimation Error (or lack thereof) of Coalescent-Based Species Tree Methods.
Syst Biol. 2015 Jul;64(4):663-76. doi: 10.1093/sysbio/syv016. Epub 2015 Mar 25.

Cited by

1. Leveraging Weighted Quartet Distributions for Enhanced Species Tree Inference from Genome-Wide Data.
Genome Biol Evol. 2025 Sep 2;17(9). doi: 10.1093/gbe/evaf159.
2. Concatenation fails to describe the anomalous radiation of giant cockroaches (Blattodea: Blaberidae) despite moderate to low discordance.
BMC Ecol Evol. 2025 Jul 21;25(1):72. doi: 10.1186/s12862-025-02409-4.
3. An updated phylogeny of Boraginales based on the Angiosperms353 probe set: a roadmap for understanding morphological evolution.
Ann Bot. 2025 Sep 2;136(1):77-97. doi: 10.1093/aob/mcaf061.
4. wQFM-TREE: highly accurate and scalable quartet-based species tree inference from gene trees.
Bioinform Adv. 2025 Mar 13;5(1):vbaf053. doi: 10.1093/bioadv/vbaf053. eCollection 2025.
5. Annotated Bioinformatic Pipelines for Phylogenomic Placement of Mitochondrial Genomes.
Bio Protoc. 2025 Mar 5;15(5):e5232. doi: 10.21769/BioProtoc.5232.
6. WASTER: Practical phylogenomics from low-coverage short reads.
bioRxiv. 2025 Jan 24:2025.01.20.633983. doi: 10.1101/2025.01.20.633983.
7. CASTER: Direct species tree inference from whole-genome alignments.
Science. 2025 Feb 28;387(6737):eadk9688. doi: 10.1126/science.adk9688.
8. Testing Phylogenetic Placement Accuracy of DNA Barcode Sequences on a Fish Backbone Tree: Implications of Backbone Tree Completeness and Species Representation.
Ecol Evol. 2025 Jan 7;15(1):e70817. doi: 10.1002/ece3.70817. eCollection 2025 Jan.
9. Terraces in species tree inference from gene trees.
BMC Ecol Evol. 2024 Nov 4;24(1):135. doi: 10.1186/s12862-024-02309-z.
10. A Phylogenomic Backbone for Acoelomorpha Inferred From Transcriptomic Data.
Syst Biol. 2025 Feb 10;74(1):70-85. doi: 10.1093/sysbio/syae057.