Irigoien I, Fernandez E, Vives S, Arenas C
Departamento de Ciencias de la Computación e Inteligencia Artificial, Euskal Herriko Unibertsitatea, Spain.
Genetika. 2008 Aug;44(8):1137-40.
Microarray technology is increasingly being applied in biological and medical research to address a wide range of problems. Cluster analysis has proven to be a very useful tool for investigating the structure of microarray data. This paper presents a program for clustering microarray data that is based on the so-called path distance. At each step, the algorithm partitions the data into two clusters, and no prior assumptions about the structure of the clusters are required. It assigns each object (gene or sample) to exactly one cluster and finds the global optimum of the function that quantifies the adequacy of a given partition of the sample into k clusters. The program was tested on experimental data sets, and the results showed the algorithm to be robust.
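The abstract does not define the path distance or the partition criterion, so the following is only an illustrative sketch of one bipartition step, not the authors' program. It assumes that the path distance between two objects is the minimax path distance (the smallest achievable "largest hop" over all paths linking them through other objects) and that one step separates the objects at the largest such distance. Both assumptions, and the function names `minimax_path_distances` and `bipartition`, are hypothetical.

```python
import math


def minimax_path_distances(points):
    """All-pairs minimax path distance (an ASSUMED reading of 'path distance').

    The cost of a path is its longest hop; the distance between two points
    is the smallest such cost over all paths through the other points.
    """
    n = len(points)
    # Start from plain Euclidean distances between every pair of points.
    d = [[math.dist(p, q) for q in points] for p in points]
    # Floyd-Warshall variant: a path through k costs the larger of its halves.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


def bipartition(points):
    """One splitting step (hypothetical criterion, not taken from the paper).

    Points that reach the first point with a path distance strictly below
    the overall maximum get label 0; all remaining points get label 1.
    """
    d = minimax_path_distances(points)
    d_max = max(max(row) for row in d)
    if d_max == 0:
        # All points coincide, so there is nothing to split.
        return [0] * len(points)
    return [0 if d[0][j] < d_max else 1 for j in range(len(points))]


if __name__ == "__main__":
    # Two well-separated groups of three samples each (made-up toy data).
    samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(bipartition(samples))
```

Applying `bipartition` again to each resulting cluster would give a divisive hierarchy with one two-way split per step. Because the distance follows chains of nearby objects, this kind of split makes no assumption about cluster shape, which is in the spirit of the abstract's claim but is not a reconstruction of the published method.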