Phylogenomic branch length estimation using quartets.

Affiliations

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States.

Department of Integrative Biology, University of California at Berkeley, Berkeley, CA 94720, United States.

Publication information

Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i185-i193. doi: 10.1093/bioinformatics/btad221.

Abstract

MOTIVATION

Branch lengths and topology of a species tree are essential in most downstream analyses, including estimation of diversification dates, characterization of selection, understanding adaptation, and comparative genomics. Modern phylogenomic analyses often use methods that account for the heterogeneity of evolutionary histories across the genome due to processes such as incomplete lineage sorting. However, these methods typically do not generate branch lengths in units that are usable by downstream applications, forcing phylogenomic analyses to resort to alternative shortcuts such as estimating branch lengths by concatenating gene alignments into a supermatrix. Yet, concatenation and other available approaches for estimating branch lengths fail to address heterogeneity across the genome.

RESULTS

In this article, we derive expected values of gene tree branch lengths in substitution units under an extension of the multispecies coalescent (MSC) model that allows substitutions with varying rates across the species tree. We present CASTLES, a new technique for estimating branch lengths on the species tree from estimated gene trees that uses these expected values, and our study shows that CASTLES improves on the most accurate prior methods with respect to both speed and accuracy.
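To make the idea of an expected gene tree branch length concrete, the sketch below simulates the simplest possible case under the standard MSC: two species that split `tau` coalescent units ago, with one sampled lineage each. It is not the CASTLES derivation or its code. The function names, the single constant substitution rate `rate` (the article's model allows rates to vary across the species tree), and all numeric values are assumptions chosen for illustration. The only facts used are that two lineages in the ancestral population coalesce after an exponentially distributed waiting time with mean 1 coalescent unit, so the expected depth of the gene tree divergence is `rate * (tau + 1)` rather than `rate * tau`.

```python
import random


def simulate_gene_divergence_depths(tau, rate, n_genes, seed=0):
    """Simulate the depth of the gene tree divergence for two species.

    tau     : species split time in coalescent units (assumed known here)
    rate    : hypothetical constant substitution rate per coalescent unit
    n_genes : number of independent gene trees to simulate
    Returns one depth per gene, in substitution units.
    """
    rng = random.Random(seed)
    depths = []
    for _ in range(n_genes):
        # Above the split, the two lineages coalesce at rate 1, so the
        # extra waiting time is Exponential(1) in coalescent units.
        wait = rng.expovariate(1.0)
        depths.append(rate * (tau + wait))
    return depths


def expected_gene_divergence_depth(tau, rate):
    """Expected depth under the same assumptions: rate * (tau + 1)."""
    return rate * (tau + 1.0)


if __name__ == "__main__":
    depths = simulate_gene_divergence_depths(tau=0.5, rate=0.01, n_genes=100_000)
    print("simulated mean:", sum(depths) / len(depths))
    print("expected value:", expected_gene_divergence_depth(0.5, 0.01))
```

Every simulated gene tree is at least as deep as the species split, and the excess varies from gene to gene. This is the kind of heterogeneity that a single concatenated alignment averages over without modelling.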

AVAILABILITY AND IMPLEMENTATION

CASTLES is available at https://github.com/ytabatabaee/CASTLES.
