Designing Weights for Quartet-Based Methods When Data are Heterogeneous Across Lineages.

Affiliations

Institut de Matematiques de la UPC-BarcelonaTech (IMTech), Universitat Politècnica de Catalunya and Centre de Recerca Matemàtica, Av. Diagonal 647, 08028, Barcelona, Spain.

University of British Columbia, Vancouver, Canada.

Publication information

Bull Math Biol. 2023 Jun 13;85(7):68. doi: 10.1007/s11538-023-01167-y.

Abstract

Homogeneity across lineages is a common assumption in phylogenetics, according to which nucleotide substitution rates are the same in all lineages. Many phylogenetic methods relax this hypothesis but keep the model simple enough to make the process of sequence evolution tractable. On the other hand, dealing successfully with the general case (heterogeneity of rates across lineages) is one of the key features of phylogenetic reconstruction methods based on algebraic tools. The goal of this paper is twofold. First, we present a new weighting system for quartets (ASAQ) based on algebraic and semi-algebraic tools, which is therefore especially suited to data evolving under heterogeneous rates. This method combines the weights of two previous methods by means of a test based on the positivity of the branch lengths estimated with the paralinear distance. ASAQ is statistically consistent when applied to data generated under the general Markov model, accounts for rate and base composition heterogeneity among lineages, and assumes neither stationarity nor time-reversibility. Second, we test and compare the performance of several quartet-based methods for phylogenetic tree reconstruction (namely QFM, wQFM, quartet puzzling, weight optimization and Willson's method) in combination with several systems of weights, including ASAQ weights and other weights based on algebraic and semi-algebraic methods or on the paralinear distance. These tests are applied to both simulated and real data, and they support weight optimization with ASAQ weights as a reliable and successful reconstruction method that improves upon the accuracy of global methods (such as neighbor-joining or maximum likelihood) in the presence of long branches or on mixtures of distributions on trees.
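The abstract mentions a test based on the positivity of branch lengths estimated with the paralinear distance. The following is a minimal Python sketch of that general idea, not the ASAQ method itself, whose details are not given here. It assumes the usual form of the paralinear distance, d = -ln(|det J| / sqrt(det D_x · det D_y)), where J is the joint base-frequency matrix of two aligned sequences and D_x, D_y are diagonal matrices of their base frequencies. It also assumes a four-point estimate of the internal branch length that averages the two competing pairings. The function names `joint_frequencies`, `paralinear_distance` and `internal_branch_lengths` are hypothetical and chosen only for illustration.

```python
import numpy as np

BASES = "ACGT"


def joint_frequencies(seq_x, seq_y):
    """Return the 4x4 matrix of relative frequencies of aligned base pairs."""
    index = {base: i for i, base in enumerate(BASES)}
    counts = np.zeros((4, 4))
    for a, b in zip(seq_x, seq_y):
        if a in index and b in index:  # skip gaps and ambiguity codes
            counts[index[a], index[b]] += 1
    return counts / counts.sum()


def paralinear_distance(seq_x, seq_y):
    """Paralinear distance between two aligned sequences (assumed formula).

    If the joint matrix is singular (for example, a base is missing),
    the result is infinite.
    """
    J = joint_frequencies(seq_x, seq_y)
    freq_x = J.sum(axis=1)  # base frequencies of seq_x (diagonal of D_x)
    freq_y = J.sum(axis=0)  # base frequencies of seq_y (diagonal of D_y)
    det_j = abs(np.linalg.det(J))
    return -np.log(det_j / np.sqrt(np.prod(freq_x) * np.prod(freq_y)))


def internal_branch_lengths(d):
    """Estimate the internal branch length for each quartet topology.

    d is a 4x4 symmetric distance matrix for taxa 0..3. For topology ab|cd
    the estimate is (sum of the two other pairings) / 4 - (d_ab + d_cd) / 2,
    an averaged four-point formula used here as an illustrative assumption.
    """
    pairings = {
        "01|23": d[0, 1] + d[2, 3],
        "02|13": d[0, 2] + d[1, 3],
        "03|12": d[0, 3] + d[1, 2],
    }
    total = sum(pairings.values())
    return {
        topology: (total - within) / 4 - within / 2
        for topology, within in pairings.items()
    }


# Distances that are exactly additive on the tree 01|23 with pendant
# branches 0.1, 0.2, 0.3, 0.4 and internal branch 0.5.
example = np.array([
    [0.0, 0.3, 0.9, 1.0],
    [0.3, 0.0, 1.0, 1.1],
    [0.9, 1.0, 0.0, 0.7],
    [1.0, 1.1, 0.7, 0.0],
])
for topology, length in internal_branch_lengths(example).items():
    print(topology, "positive" if length > 0 else "non-positive", length)
```

With exactly additive distances, only the true topology receives a positive estimate, since the three estimates always sum to zero under this formula. With distances estimated from real data the signs can be less clear-cut, which is presumably why a positivity test is useful when deciding between weighting schemes.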

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac1f/10264505/70a12bed1373/11538_2023_1167_Fig1_HTML.jpg
