Tadepalli Satish, Ramakrishnan Naren, Watson Layne T, Mishra Bud, Helm Richard F
Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA.
J Bioinform Comput Biol. 2009 Apr;7(2):339-56. doi: 10.1142/s0219720009004114.
We present a new approach to segmenting multiple time series by analyzing the dynamics of cluster formation and rearrangement around putative segment boundaries. This approach finds application in distilling large numbers of gene expression profiles into temporal relationships underlying biological processes. By directly minimizing information-theoretic measures of segmentation quality derived from Kullback-Leibler (KL) divergences, our formulation reveals clusters of genes along with a segmentation such that clusters show concerted behavior within segments but exhibit significant regrouping across segmentation boundaries. The results of the segmentation algorithm can be summarized as Gantt charts revealing temporal dependencies in the ordering of key biological processes. Applications to the yeast metabolic cycle and the yeast cell cycle are described.
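As a rough illustration of the boundary criterion described in the abstract, the sketch below scores how strongly genes regroup between the clusterings on either side of a putative segment boundary. It assumes, since the abstract gives no formula, that regrouping is measured by how close the rows and columns of the cluster contingency table are to uniform, using KL divergence from the uniform distribution. The function names (`contingency`, `kl_from_uniform`, `regrouping_score`) and the averaging of row and column terms are hypothetical choices made for this example, not the authors' implementation.

```python
import math
from collections import Counter


def contingency(labels_a, labels_b):
    """Count genes in cluster i of segment A that land in cluster j of segment B."""
    rows = sorted(set(labels_a))
    cols = sorted(set(labels_b))
    counts = Counter(zip(labels_a, labels_b))
    return [[counts[(r, c)] for c in cols] for r in rows]


def kl_from_uniform(counts):
    """KL(p || uniform) in bits for one row or column of counts.

    0 means the cluster's genes spread evenly over the other side's clusters.
    """
    total = sum(counts)
    k = len(counts)
    return sum((c / total) * math.log2((c / total) * k) for c in counts if c > 0)


def regrouping_score(labels_a, labels_b):
    """Assumed objective: mean row KL plus mean column KL (lower = more regrouping)."""
    table = contingency(labels_a, labels_b)
    row_terms = [kl_from_uniform(row) for row in table]
    col_terms = [kl_from_uniform(list(col)) for col in zip(*table)]
    return sum(row_terms) / len(row_terms) + sum(col_terms) / len(col_terms)


# Four genes, two clusters on each side of a candidate boundary.
same = regrouping_score([0, 0, 1, 1], [0, 0, 1, 1])      # clusters persist unchanged
shuffled = regrouping_score([0, 0, 1, 1], [0, 1, 0, 1])  # clusters fully regroup
print(same, shuffled)
```

Under these assumptions, a candidate boundary where clusters persist unchanged scores high, and one where each cluster scatters evenly across the clusters of the next segment scores zero. A segmentation search would therefore prefer boundaries with low scores.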