

Hypothesis Testing With Rank Conditions in Phylogenetics.

Authors

Long Colby, Kubatko Laura

Affiliations

Department of Mathematical and Computational Sciences, College of Wooster, Wooster, OH, United States.

Department of Statistics and Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, United States.

Publication information

Front Genet. 2021 Jul 2;12:664357. doi: 10.3389/fgene.2021.664357. eCollection 2021.

DOI:10.3389/fgene.2021.664357
PMID:34276772
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8283673/
Abstract

A phylogenetic model of sequence evolution for a set of n taxa is a collection of probability distributions on the 4^n possible site patterns that may be observed in their aligned DNA sequences. For a four-taxon model, one can arrange the entries of these probability distributions into three flattening matrices that correspond to the three different unrooted leaf-labeled four-leaf trees, or quartet trees. The flattening matrix corresponding to the tree parameter of the model is known to satisfy certain rank conditions. Methods such as ErikSVD and SVDQuartets take advantage of this observation by applying singular value decomposition to flattening matrices consisting of empirical data. Each possible quartet is assigned an "SVD score" based on how close the flattening is to the set of matrices of the predicted rank. When choosing among possible quartets, the one with the lowest score is inferred to be the phylogeny of the four taxa under consideration. Since an n-leaf phylogenetic tree is determined by its quartets, this approach can be generalized to infer larger phylogenies. In this article, we explore using the SVD score as a test statistic to test whether phylogenetic data were generated by a particular quartet tree. To do so, we use several results to approximate the distribution of the SVD score and to give upper bounds on the p-value of the associated hypothesis tests. We also apply these hypothesis tests to simulated phylogenetic data and discuss the implications for interpreting SVD scores in rank-based inference methods.
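
The flattening-and-score idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or the SVDQuartets code. It assumes a target rank of 4 (the abstract only says "the predicted rank"), and it takes the score to be the Frobenius distance from the empirical 16 x 16 flattening to the nearest matrix of that rank, which by the Eckart-Young theorem is the root sum of squares of the trailing singular values. The names `flattening`, `svd_score`, `score_quartets`, and `SPLITS` are hypothetical.

```python
import numpy as np

BASES = "ACGT"
IDX = {b: i for i, b in enumerate(BASES)}

# The three unrooted quartet trees on taxa 1..4, written as splits of
# 0-based taxon indices (hypothetical labels chosen for this sketch).
SPLITS = {
    "12|34": ((0, 1), (2, 3)),
    "13|24": ((0, 2), (1, 3)),
    "14|23": ((0, 3), (1, 2)),
}


def flattening(seqs, split):
    """Return the 16 x 16 matrix of empirical site-pattern frequencies.

    Rows are indexed by the states of the first taxon pair of the split,
    columns by the states of the second pair. Alignment columns that
    contain anything other than A, C, G, T are skipped.
    """
    (a, b), (c, d) = split
    F = np.zeros((16, 16))
    for col in zip(*seqs):
        if any(ch not in IDX for ch in col):
            continue
        row = 4 * IDX[col[a]] + IDX[col[b]]
        column = 4 * IDX[col[c]] + IDX[col[d]]
        F[row, column] += 1
    total = F.sum()
    return F / total if total else F


def svd_score(F, rank=4):
    """Frobenius distance from F to the nearest matrix of the given rank.

    The rank of 4 is an assumption of this sketch, not a value stated
    in the abstract.
    """
    s = np.linalg.svd(F, compute_uv=False)  # singular values, descending
    return float(np.sqrt(np.sum(s[rank:] ** 2)))


def score_quartets(seqs, rank=4):
    """Score all three quartet trees for four aligned sequences."""
    return {name: svd_score(flattening(seqs, split), rank)
            for name, split in SPLITS.items()}
```

Under these assumptions, calling `score_quartets` on four aligned sequences returns one score per split, and the split with the smallest score is the inferred quartet. The hypothesis tests discussed in the abstract concern how large such a score can plausibly be under a given quartet tree; that step is not sketched here.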


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d5b/8283673/753c67e48787/fgene-12-664357-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d5b/8283673/986e39aff227/fgene-12-664357-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d5b/8283673/9a7fc58ed85a/fgene-12-664357-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d5b/8283673/d254bfe9155e/fgene-12-664357-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d5b/8283673/aa165698aca0/fgene-12-664357-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d5b/8283673/ad4458b5ad69/fgene-12-664357-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d5b/8283673/13a9530d60b8/fgene-12-664357-g0007.jpg

