WF-MSB: a weighted fuzzy-based biclustering method for gene expression data.

Author information

Chen Lien-Chin, Yu Philip S, Tseng Vincent S

Affiliation

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan.

Publication information

Int J Data Min Bioinform. 2011;5(1):89-109. doi: 10.1504/ijdmb.2011.038579.

PMID:21491846

Abstract

Biclustering is an important analysis method for gene expression data that finds subsets of genes sharing compatible expression patterns. Although several biclustering algorithms have been proposed, few provide a query-driven approach that lets biologists search for biclusters containing a particular gene of interest. In this paper, we propose a generalised fuzzy-based approach, namely Weighted Fuzzy-based Maximum Similarity Biclustering (WF-MSB), for extracting a query-driven bicluster based on a user-defined reference gene. A fuzzy-based similarity measure and a condition weighting approach are used to extract biclusters that are significant in terms of expression levels. WF-MSB discovers both the bicluster most similar to the reference gene and the bicluster most dissimilar to it. The proposed WF-MSB method was evaluated against MSBE on a real yeast microarray data set and on synthetic data sets. The experimental results show that WF-MSB can effectively find biclusters with significant GO-based functional meaning.
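
The abstract does not give the WF-MSB formulas, so the following is only a minimal sketch of the general idea of scoring genes against a reference gene with a fuzzy similarity and per-condition weights. The triangular membership function, the `width` and `threshold` parameters, the fixed weights, and the names `fuzzy_similarity` and `query_genes` are all assumptions made for illustration and are not taken from the paper.

```python
def fuzzy_similarity(gene, reference, weights, width=1.0):
    """Weighted mean of per-condition fuzzy memberships (illustrative only).

    Assumption: a triangular membership, 1 when the expression values match
    and falling linearly to 0 once they differ by `width` or more.
    """
    total = sum(weights)
    score = 0.0
    for g, r, w in zip(gene, reference, weights):
        membership = max(0.0, 1.0 - abs(g - r) / width)
        score += w * membership
    return score / total


def query_genes(matrix, ref_index, weights, width=1.0, threshold=0.8):
    """Return indices of genes whose similarity to the reference gene
    reaches `threshold`, together with all similarity scores."""
    reference = matrix[ref_index]
    scores = [fuzzy_similarity(row, reference, weights, width) for row in matrix]
    selected = [i for i, s in enumerate(scores) if s >= threshold]
    return selected, scores


# Toy expression matrix: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0, 4.0],  # gene 0, used as the reference gene
    [1.1, 2.1, 2.9, 0.0],  # gene 1: close to gene 0 on the first three conditions
    [4.0, 3.0, 2.0, 1.0],  # gene 2: opposite trend
]
# Hypothetical condition weights: the last condition is ignored entirely.
weights = [1.0, 1.0, 1.0, 0.0]

selected, scores = query_genes(expression, 0, weights)
print(selected)
```

In this toy setting the weights are fixed by hand and only the gene side of a bicluster is selected; a full method would also have to determine the condition subset and the weights themselves, which this sketch does not attempt.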
