Sundberg Kenneth, Clement Mark, Snell Quinn
Computer Science Department, Brigham Young University, Provo, UT 84602, USA.
Algorithms Mol Biol. 2010 Jun 8;5(1):26. doi: 10.1186/1748-7188-5-26.
Phylogenetic analysis is becoming an increasingly important tool for biological research. Applications include epidemiological studies, drug development, and evolutionary analysis. Phylogenetic search is known to be an NP-hard problem. The size of the data sets that can be analyzed is limited by the exponential growth in the number of trees that must be considered as the problem size increases. A better understanding of the problem space could lead to better methods, which in turn could make the analysis of more data sets feasible. We present a definition of phylogenetic tree space and a visualization of this space that shows significant exploitable structure. This structure can be used to develop search methods capable of handling much larger data sets.
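To give a rough sense of the growth in the number of candidate trees mentioned above, the sketch below counts the distinct unrooted, fully resolved (binary) tree topologies for n labeled taxa using the standard combinatorial result (2n-5)!!. This is an illustrative assumption: the abstract does not state which class of trees the authors count, and the function name is hypothetical.

```python
def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Count unrooted binary tree topologies on n_taxa labeled leaves.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5).
    Assumption: unrooted, fully resolved trees; not taken from the abstract.
    """
    if n_taxa < 3:
        # With fewer than three taxa there is only one possible topology.
        return 1
    count = 1
    # Multiply the odd numbers from 3 up to 2n - 5 inclusive.
    for k in range(3, 2 * n_taxa - 5 + 1, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20, 50):
        print(f"{n:>3} taxa: {num_unrooted_binary_trees(n):.3e} trees")
```

Under this assumption, 10 taxa already give 2,027,025 topologies, which is why exhaustive enumeration quickly becomes impractical and heuristic search over tree space is used instead.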