Control and Computer Engineering Department, Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129, Torino, Italy.
IEEE/ACM Trans Comput Biol Bioinform. 2011 May-Jun;8(3):577-91. doi: 10.1109/TCBB.2010.90.
Despite great advances in discovering cancer molecular profiles, the proper application of microarray technology to routine clinical diagnostics is still a challenge. Current practices in the classification of microarray data show two main limitations: the reliability of the training data sets used to build the classifiers, and the classifiers' performance, especially when the sample to be classified does not belong to any of the available classes. In this case, state-of-the-art algorithms usually produce a high rate of false positives, which is unacceptable in real diagnostic applications. To address this problem, this paper presents a new cDNA microarray data classification algorithm that is based on graph theory and is able to overcome most of the limitations of known classification methodologies. The classifier works by analyzing gene expression data organized in an innovative graph-based data structure, where vertices correspond to genes and edges to gene expression relationships. To demonstrate the novelty of the proposed approach, the authors present an experimental performance comparison between the proposed classifier and several state-of-the-art classification algorithms.
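The abstract does not say how a "gene expression relationship" is defined, so the following is only a minimal sketch of a graph whose vertices are genes and whose edges link related genes. It assumes, purely for illustration, that two genes are related when the absolute Pearson correlation of their expression values across samples reaches a threshold. The names `pearson`, `build_gene_graph`, `threshold`, and the toy gene labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def build_gene_graph(expression, threshold=0.9):
    """Return an adjacency dict: gene -> {neighbor: correlation}.

    `expression` maps a gene name to its expression values, one per sample.
    ASSUMPTION: an edge is added when |correlation| >= threshold; the paper's
    actual edge rule is not given in the abstract.
    """
    graph = {gene: {} for gene in expression}
    for g1, g2 in combinations(expression, 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            # Undirected edge: store it in both directions.
            graph[g1][g2] = r
            graph[g2][g1] = r
    return graph


# Made-up toy values, for illustration only.
toy = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],  # scales with GENE_A
    "GENE_C": [4.0, 1.0, 3.0, 2.0],  # only weakly related to the others
}
graph = build_gene_graph(toy)
```

With these toy values, `GENE_A` and `GENE_B` end up connected while `GENE_C` stays isolated. A classifier built on such a structure could, in principle, compare a new sample against one graph per class and decline to assign a class when none matches well, which is the kind of behavior the abstract motivates for samples outside the known classes.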