Suppr超能文献

检验系统发育学在理论和实践中的决定性

Checking Phylogenetics Decisiveness in Theory and in Practice.

作者信息

Parvini Ghazaleh, Braught Katherine, Fernandez-Baca David

出版信息

IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3307-3316. doi: 10.1109/TCBB.2021.3128381. Epub 2022 Dec 8.

Abstract

Suppose we aim to build a phylogeny for a set of taxa X using information from a collection of loci, where each locus offers information for only a fraction of the taxa. The question is whether, based solely on the pattern of data availability, called a taxon coverage pattern, one can determine if the data suffices to construct a reliable phylogeny. The problem can be expressed combinatorially as follows. Let us call a taxon coverage pattern decisive if, for any binary phylogenetic tree T for X, the collection of phylogenetic trees obtained by restricting T to the subset of X covered by each locus uniquely determines T. Here we relate the problem of checking whether a taxon coverage pattern is decisive to a hypergraph coloring problem. Using this connection, we (1) show that checking decisiveness is co-NP complete; (2) obtain lower bounds on the amount of coverage needed to achieve decisiveness; (3) devise an exact algorithm for decisiveness; (4) develop problem reduction rules, and use them to obtain efficient algorithms for inputs with few loci; and (5) devise Boolean satisfiability (SAT) and integer linear programming formulations (ILP) of decisiveness that allow us to analyze data sets that arise in practice. For data sets that are not decisive, we use our SAT and ILP formulations to obtain decisive subsets of the data.

摘要

假设我们旨在利用一组基因座的信息为一组分类单元X构建系统发育树,其中每个基因座仅为一部分分类单元提供信息。问题在于,仅基于称为分类单元覆盖模式的数据可用性模式,能否确定数据是否足以构建可靠的系统发育树。该问题可以用组合方式表述如下。如果对于X的任何二叉系统发育树T,通过将T限制到每个基因座覆盖的X的子集而得到的系统发育树集合唯一地确定T,我们就称分类单元覆盖模式是决定性的。在这里,我们将检查分类单元覆盖模式是否具有决定性的问题与超图着色问题联系起来。利用这种联系,我们:(1)表明检查决定性是co - NP完全问题;(2)获得实现决定性所需覆盖量的下限;(3)设计一种用于决定性的精确算法;(4)制定问题简化规则,并使用它们为基因座较少的输入获得高效算法;(5)设计决定性的布尔可满足性(SAT)和整数线性规划公式(ILP),使我们能够分析实际中出现的数据集。对于非决定性的数据集,我们使用我们的SAT和ILP公式来获得数据的决定性子集。

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验