Quartet-Based Computations of Internode Certainty Provide Robust Measures of Phylogenetic Incongruence.

Affiliations

Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, 483 Wushan Road, Guangzhou 510642, P.R. China.

Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Schloss-Wolfsbrunnenweg 35, Heidelberg D-68159, Germany.

Publication information

Syst Biol. 2020 Mar 1;69(2):308-324. doi: 10.1093/sysbio/syz058.

Abstract

Incongruence, or topological conflict, is prevalent in genome-scale data sets. Internode certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internal branch among a set of phylogenetic trees and complement regular branch support measures (e.g., bootstrap, posterior probability) that instead assess the statistical confidence of inference. Since most phylogenomic studies contain data partitions (e.g., genes) with missing taxa and IC scores stem from the frequencies of bipartitions (or splits) on a set of trees, IC score calculation typically requires adjusting the frequencies of bipartitions from these partial gene trees. However, when the proportion of missing taxa is high, the scores yielded by current approaches that adjust bipartition frequencies in partial gene trees differ substantially from each other and tend to be overestimates. To overcome these issues, we developed three new IC measures based on the frequencies of quartets, which naturally apply to both complete and partial trees. Comparison of our new quartet-based measures to previous bipartition-based measures on simulated data shows that: (1) on complete data sets, both quartet-based and bipartition-based measures yield very similar IC scores; (2) IC scores of quartet-based measures on a given data set with and without missing taxa are more similar than the scores of bipartition-based measures; and (3) quartet-based measures are more robust to the absence of phylogenetic signal and errors in phylogenetic inference than bipartition-based measures. Additionally, the analysis of an empirical mammalian phylogenomic data set using our quartet-based measures reveals the presence of substantial levels of incongruence for numerous internal branches. An efficient open-source implementation of these quartet-based measures is freely available in the program QuartetScores (https://github.com/lutteropp/QuartetScores).
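To make the idea of scoring incongruence from topology frequencies concrete, here is a minimal sketch in Python. It assumes an entropy-style score of the form 1 + Σ pᵢ log_k pᵢ over k competing alternatives, where k = 3 corresponds to the three possible topologies of a quartet. The function name `certainty` and the example counts are hypothetical, and the exact definitions of the three measures implemented in QuartetScores may differ from this illustration.

```python
import math


def certainty(counts):
    """Entropy-style certainty over k competing topologies.

    counts: how often each competing topology was observed across a set
    of trees (hypothetical input for illustration).
    Returns 1.0 when a single topology accounts for every observation and
    0.0 when all k topologies are equally frequent.
    """
    k = len(counts)
    total = sum(counts)
    if k < 2 or total <= 0:
        raise ValueError("need at least two alternatives and one observation")
    score = 1.0
    for c in counts:
        if c > 0:  # treat 0 * log(0) as 0
            p = c / total
            score += p * math.log(p, k)  # logarithm in base k
    return score


# A quartet has three possible topologies, so k = 3 here.
# Only trees containing all four taxa contribute an observation, which is
# why quartet counts can be taken from partial trees without adjustment.
unanimous = certainty([40, 0, 0])     # every tree agrees
conflicting = certainty([14, 13, 13])  # near-even three-way conflict
```

Under these assumptions the unanimous case scores 1 and the near-even case scores close to 0, which mirrors the intuition that higher values indicate less conflict.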
