Murali T M, Kasif Simon
Bioinformatics Program, 48 Cummington St., Boston University, Boston, MA 02152, USA.
Pac Symp Biocomput. 2003:77-88.
We propose a representation for gene expression data called conserved gene expression motifs, or XMOTIFs. A gene's expression level is conserved across a set of samples if the gene is expressed with the same abundance in all of those samples. A conserved gene expression motif is a subset of genes that is simultaneously conserved across a subset of samples. We present a computational technique to discover large conserved gene expression motifs that cover all the samples and classes in the data. When applied to published data sets representing different cancers or disease outcomes, our algorithm constructs XMOTIFs that distinguish between the various classes.
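The definition of a conserved gene expression motif given in the abstract can be illustrated with a minimal sketch. It is not the authors' discovery algorithm, which the abstract does not describe. It only checks the definition, and it assumes that expression levels have already been discretized into states (here the hypothetical labels "up" and "down"), so that "the same abundance" means "the same state". The function names, gene names, and toy data are illustrative assumptions.

```python
from typing import Dict, List, Sequence


def conserved_genes(states: Dict[str, List[str]], samples: Sequence[int]) -> List[str]:
    """Return the genes whose (assumed, pre-discretized) state is identical
    in every one of the given samples."""
    conserved = []
    for gene, row in states.items():
        observed = {row[s] for s in samples}
        # Exactly one distinct state means the gene is conserved across the samples.
        if len(observed) == 1:
            conserved.append(gene)
    return conserved


def is_conserved_motif(
    states: Dict[str, List[str]], genes: Sequence[str], samples: Sequence[int]
) -> bool:
    """Check the definition: every gene in the subset is conserved
    across every sample in the subset."""
    return all(len({states[g][s] for s in samples}) == 1 for g in genes)


# Toy data (hypothetical): rows are genes, columns are samples 0..3.
toy_states = {
    "geneA": ["up", "up", "up", "down"],
    "geneB": ["down", "down", "down", "down"],
    "geneC": ["up", "down", "up", "up"],
}

motif_genes = conserved_genes(toy_states, [0, 1, 2])
```

In this toy example, geneA and geneB form a conserved motif over samples 0 to 2, but not over all four samples, because geneA changes state in sample 3.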