A fast method for calculating reliable event supports in tree reconciliations via Pareto optimality.

Author information

To Thu-Hien, Jacox Edwin, Ranwez Vincent, Scornavacca Celine

Affiliations

ISEM - Université de Montpellier, CNRS, IRD, EPHE, Place Eugène Bataillon, Montpellier, 34392, France.

Institut de Biologie Computationnelle (IBC), Montpellier, 34095, France.

Publication information

BMC Bioinformatics. 2015 Nov 14;16:384. doi: 10.1186/s12859-015-0803-x.

DOI:10.1186/s12859-015-0803-x

PMID:26573665

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4647304/

Abstract

BACKGROUND

Given a gene and a species tree, reconciliation methods attempt to retrieve the macro-evolutionary events that best explain the discrepancies between the two tree topologies. The DTL parsimonious approach searches for a most parsimonious reconciliation between a gene tree and a (dated) species tree, considering four possible macro-evolutionary events (speciation, duplication, transfer, and loss) with specific costs. Unfortunately, many events are erroneously predicted due to errors in the input trees, inappropriate input cost values or because of the existence of several equally parsimonious scenarios. It is thus crucial to provide a measure of the reliability for predicted events. It has been recently proposed that the reliability of an event can be estimated via its frequency in the set of most parsimonious reconciliations obtained using a variety of reasonable input cost vectors. To compute such a support, a straightforward but time-consuming approach is to generate the costs slightly departing from the original ones, independently compute the set of all most parsimonious reconciliations for each vector, and combine these sets a posteriori. Another proposed approach uses Pareto-optimality to partition cost values into regions which induce reconciliations with the same number of DTL events. The support of an event is then defined as its frequency in the set of regions. However, often, the number of regions is not large enough to provide reliable supports.
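The two notions used above, Pareto-optimal event counts and frequency-based event support, can be illustrated with a small sketch. It is not the authors' implementation: the function names, the `(duplications, transfers, losses)` tuple layout, the representation of a reconciliation as a set of event labels, and the assumption that speciations cost nothing are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical layout: (number of duplications, transfers, losses).
EventCounts = Tuple[int, int, int]


def cost(counts: EventCounts, costs: Tuple[float, float, float]) -> float:
    """Weighted DTL cost of a reconciliation (speciations assumed free)."""
    return sum(n * c for n, c in zip(counts, costs))


def pareto_front(vectors: Iterable[EventCounts]) -> List[EventCounts]:
    """Return the event-count vectors that no other vector dominates.

    A vector w dominates v when w differs from v and uses no more
    events of any type than v does.
    """
    unique = set(vectors)

    def dominated(v: EventCounts) -> bool:
        return any(
            w != v and all(a <= b for a, b in zip(w, v)) for w in unique
        )

    return sorted(v for v in unique if not dominated(v))


def event_supports(
    reconciliations: Iterable[FrozenSet[str]],
) -> Dict[str, float]:
    """Support of an event = fraction of reconciliations that contain it."""
    recs = list(reconciliations)
    counter = Counter(event for rec in recs for event in rec)
    return {event: n / len(recs) for event, n in counter.items()}
```

For positive event costs, only a vector on the Pareto front can have minimum cost for some cost vector, which is why partitioning the cost space by these vectors is possible. The support function here simply counts over an explicit list of reconciliations, whether they come from sampled cost vectors or from one representative per region.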

RESULTS

We present here a method to compute efficiently event supports via a polynomial-sized graph, which can represent all reconciliations for several different costs. Moreover, two methods are proposed to take into account alternative input costs: either explicitly providing an input cost range or allowing a tolerance for the over cost of a reconciliation. Our methods are faster than the region based method, substantially faster than the sampling-costs approach, and have a higher event-prediction accuracy on simulated data.
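The two ways of accounting for alternative input costs can be sketched by brute force over an explicit list of candidate reconciliations, each summarized by its event counts. This is only an illustration of the selection criteria: the paper's method works on a polynomial-sized graph rather than an enumerated list, and the names below, the relative form of the tolerance (`epsilon` as a fraction of the optimal cost), and the use of a finite list of cost vectors to stand in for a cost range are all assumptions.

```python
from typing import Iterable, List, Sequence, Tuple

EventCounts = Tuple[int, int, int]      # (duplications, transfers, losses)
Costs = Tuple[float, float, float]      # matching per-event costs


def total_cost(counts: EventCounts, costs: Costs) -> float:
    """Weighted DTL cost of one candidate (speciations assumed free)."""
    return sum(n * c for n, c in zip(counts, costs))


def within_tolerance(
    candidates: Sequence[EventCounts], costs: Costs, epsilon: float
) -> List[EventCounts]:
    """Candidates whose cost exceeds the optimum by at most a factor epsilon.

    Assumes a relative tolerance; epsilon = 0 keeps only optimal candidates.
    """
    best = min(total_cost(c, costs) for c in candidates)
    limit = best * (1 + epsilon)
    return [c for c in candidates if total_cost(c, costs) <= limit]


def optimal_over_range(
    candidates: Sequence[EventCounts], cost_vectors: Iterable[Costs]
) -> List[EventCounts]:
    """Candidates that are optimal for at least one of the given cost vectors."""
    keep = set()
    for costs in cost_vectors:
        keep.update(within_tolerance(candidates, costs, 0.0))
    return sorted(keep)
```

Event supports would then be computed as frequencies over whichever set of reconciliations is retained, so widening the cost range or the tolerance enlarges the set that the supports are averaged over.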

CONCLUSIONS

We propose a new approach to improve the accuracy of event supports for parsimonious reconciliation methods to account for uncertainty in the input costs. Furthermore, because of their speed, our methods can be used on large gene families. Our algorithms are implemented in the ecceTERA program, freely available from http://mbb.univ-montp2.fr/MBB/.
