Bar-Joseph Z, Gifford D K, Jaakkola T S
Laboratory for Computer Science, MIT, 545 Technology Square, Cambridge, MA 02139, USA.
Bioinformatics. 2001;17 Suppl 1:S22-9. doi: 10.1093/bioinformatics/17.suppl_1.s22.
We present the first practical algorithm for the optimal linear leaf ordering of trees that are generated by hierarchical clustering. Hierarchical clustering has been extensively used to analyze gene expression data, and we show how optimal leaf ordering can reveal biological structure that is not observed with an existing heuristic ordering method. For a tree with n leaves, there are 2(n-1) linear orderings consistent with the structure of the tree. Our optimal leaf ordering algorithm runs in time O(n(4)), and we present further improvements that make the running time of our algorithm practical.
我们提出了一种用于由层次聚类生成的树的最优线性叶排序的首个实用算法。层次聚类已被广泛用于分析基因表达数据,并且我们展示了最优叶排序如何能够揭示用现有的启发式排序方法无法观察到的生物学结构。对于一棵有(n)个叶的树,存在(2(n - 1))种与树的结构一致的线性排序。我们的最优叶排序算法的运行时间为(O(n^4)),并且我们还提出了进一步的改进,使我们算法的运行时间具有实用性。