School of Engineering, Pablo de Olavide University, Ctra. Utrera s/n, Seville, Spain.
Comput Biol Med. 2012 Feb;42(2):245-56. doi: 10.1016/j.compbiomed.2011.11.015. Epub 2011 Dec 21.
Biclustering is becoming a popular technique for the study of gene expression data. This is mainly due to the capability of biclustering to address the data using various dimensions simultaneously, as opposed to clustering, which can use only one dimension at a time. Different heuristics have been proposed to discover interesting biclusters in data. These heuristics share one characteristic: they are guided by a measure that determines the quality of biclusters. It follows that defining such a measure is probably the most important aspect. One popular quality measure is the mean squared residue (MSR). However, it has been proven that MSR fails to identify some kinds of patterns. This motivates us to introduce a novel measure, called virtual error (VE), that overcomes this limitation. Results obtained by using VE confirm that it can identify interesting patterns that could not be found by MSR.
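To make the limitation of MSR concrete, the following minimal sketch computes MSR under its usual definition: the mean of the squared residues a_ij - a_iJ - a_Ij + a_IJ over all cells of the bicluster, where a_iJ, a_Ij and a_IJ are the row, column and overall means. The function name and the toy matrices are illustrative assumptions and do not come from the paper. Treating scaling patterns as the kind of pattern MSR misses is likewise an assumption drawn from the wider biclustering literature, since the abstract does not name the patterns. VE is not sketched here because the abstract does not define it.

```python
def mean_squared_residue(bicluster):
    """Return the MSR of a bicluster given as a list of equal-length rows
    (rows = genes, columns = conditions)."""
    n_rows = len(bicluster)
    n_cols = len(bicluster[0])
    # Row means, column means and overall mean of the submatrix.
    row_means = [sum(row) / n_cols for row in bicluster]
    col_means = [sum(row[j] for row in bicluster) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i, row in enumerate(bicluster):
        for j, value in enumerate(row):
            # Residue of one cell: what row, column and overall means
            # fail to explain additively.
            residue = value - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical expression profile shared by all genes in the toy biclusters.
base = [1.0, 2.0, 4.0, 3.0]

# Shifting pattern: each gene is the base profile plus a constant.
shifting = [[v + shift for v in base] for shift in (0.0, 5.0, 10.0)]

# Scaling pattern: each gene is the base profile times a constant.
scaling = [[v * factor for v in base] for factor in (1.0, 2.0, 3.0)]

msr_shifting = mean_squared_residue(shifting)
msr_scaling = mean_squared_residue(scaling)

print("MSR, shifting pattern:", msr_shifting)
print("MSR, scaling pattern: ", msr_scaling)
```

In this toy setting the shifting bicluster scores an MSR of essentially zero, while the scaling bicluster scores a clearly positive value even though its genes follow the same profile up to a multiplicative factor. A search guided only by MSR would therefore tend to penalize such a bicluster.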