Antonov Alexey V, Tetko Igor V, Mader Michael T, Budczies Jan, Mewes Hans W
GSF National Research Center for Environment and Health, Institute for Bioinformatics, Ingolstädter Landstrasse 1, D-85764 Neuherberg, Germany.
Bioinformatics. 2004 Mar 22;20(5):644-52. doi: 10.1093/bioinformatics/btg462. Epub 2004 Jan 22.
Microarray data appear particularly useful for investigating mechanisms in cancer biology and are among the most powerful tools for uncovering the genetic mechanisms that cause loss of cell cycle control. Several methods have recently been proposed that use microarray data as a diagnostic tool for cancer classification. These procedures take changes in the expression of particular genes into account but do not consider disruptions of gene interactions caused by the tumor. Some genes that participate in tumor development probably do not change their expression level dramatically, so they cannot be detected by the simple classification approaches used previously. A classification procedure that exploits information on changes in gene interactions is therefore needed.
We propose a MAximal MArgin Linear Programming (MAMA) method for the classification of tumor samples based on microarray data. This procedure detects groups of genes and constructs models (features) that strongly correlate with particular tumor types. The detected features include genes whose functional relations are changed for particular cancer types. The proposed method was tested on two publicly available datasets and demonstrated a prediction ability superior to previously employed classification schemes.
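For orientation, a generic maximal-margin classifier can be written as a linear program. The formulation below is a textbook-style sketch and is not taken from the paper: the symbols (samples $x_i \in \mathbb{R}^p$ built from expression-derived features, class labels $y_i \in \{-1,+1\}$, weights $w$, offset $b$, margin $\rho$) and the box constraint on the weights are assumptions made here for illustration, and the actual MAMA formulation may differ.

```latex
\begin{aligned}
\max_{w,\,b,\,\rho}\quad & \rho \\
\text{subject to}\quad & y_i \left( w^{\top} x_i + b \right) \ge \rho, && i = 1, \dots, n, \\
& -1 \le w_j \le 1, && j = 1, \dots, p .
\end{aligned}
```

Both the objective and the constraints are linear in $(w, b, \rho)$, so a problem of this shape can be handed to a general-purpose linear programming solver. Under these assumptions, a new sample $x$ would be assigned to the class given by the sign of $w^{\top} x + b$.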
The MAMA system was developed using the linear programming system LINDO (http://www.lindo.com). A Perl script that specifies the optimization problem for this software is available upon request from the authors.