Combining distance matrices on identical taxon sets for multi-gene analysis with singular value decomposition.

Authors

Abeysundera Melanie, Kenney Toby, Field Chris, Gu Hong

Affiliation

Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada.

Publication information

PLoS One. 2014 Apr 14;9(4):e94279. doi: 10.1371/journal.pone.0094279. eCollection 2014.

DOI:10.1371/journal.pone.0094279
PMID:24732341
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3986248/
Abstract

We present a simple and effective method for combining distance matrices from multiple genes on identical taxon sets to obtain a single representative distance matrix from which to derive a combined-gene phylogenetic tree. The method applies singular value decomposition (SVD) to extract the greatest common signal present in the distances obtained from each gene. The first right eigenvector of the SVD, which corresponds to a weighted average of the distance matrices of all genes, can thus be used to derive a representative tree from multiple genes. We apply our method to three well known data sets and estimate the uncertainty using bootstrap methods. Our results show that this method works well for these three data sets and that the uncertainty in these estimates is small. A simulation study is conducted to compare the performance of our method with several other distance based approaches (namely SDM, SDM* and ACS97), and we find the performances of all these approaches are comparable in the consensus setting. The computational complexity of our method is similar to that of SDM. Besides constructing a representative tree from multiple genes, we also demonstrate how the subsequent eigenvalues and eigenvectors may be used to identify if there are conflicting signals in the data and which genes might be influential or outliers for the estimated combined-gene tree.
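
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, NumPy is assumed, and the abstract does not state how (or whether) the per-gene distances are rescaled before the decomposition, so none is applied here. Each gene's distance matrix is flattened into one row, the stacked matrix is decomposed, and the leading component is read as a weighted average of the gene rows.

```python
import numpy as np


def upper_triangle(dist):
    """Flatten the strict upper triangle of a symmetric distance matrix."""
    dist = np.asarray(dist, dtype=float)
    rows, cols = np.triu_indices(dist.shape[0], k=1)
    return dist[rows, cols]


def combine_distance_matrices(matrices):
    """Combine per-gene distance matrices through the leading SVD component.

    Returns (combined, weights, singular_values).
    Assumption (not from the paper): one row per gene, no prior rescaling.
    """
    n_taxa = np.asarray(matrices[0]).shape[0]
    stacked = np.vstack([upper_triangle(m) for m in matrices])  # genes x pairs

    u, s, vt = np.linalg.svd(stacked, full_matrices=False)

    # Singular vectors are defined only up to sign; pick the non-negative one.
    if vt[0].sum() < 0:
        u[:, 0] *= -1
        vt[0] *= -1

    # v1 = (1 / s1) * sum_g u[g, 0] * row_g, so u[:, 0] carries the gene weights.
    weights = u[:, 0] / u[:, 0].sum()
    combined_vec = weights @ stacked  # proportional to the first right vector

    # Rebuild a symmetric matrix with a zero diagonal.
    combined = np.zeros((n_taxa, n_taxa))
    rows, cols = np.triu_indices(n_taxa, k=1)
    combined[rows, cols] = combined_vec
    combined[cols, rows] = combined_vec
    return combined, weights, s
```

The returned `combined` matrix could then be passed to any distance-based tree-building method. The remaining entries of `singular_values` give a rough view of how much signal is left outside the leading component, which is the kind of diagnostic the abstract refers to when it mentions conflicting signals and outlier genes. The paper's actual procedure for that step is not described in the abstract.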
