Computing the maximum similarity bi-clusters of gene expression data.

Authors

Liu Xiaowen, Wang Lusheng

Affiliation

Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong.

Publication information

Bioinformatics. 2007 Jan 1;23(1):50-6. doi: 10.1093/bioinformatics/btl560. Epub 2006 Nov 7.

DOI:10.1093/bioinformatics/btl560

PMID:17090578

Abstract

MOTIVATIONS

Bi-clustering is an important approach in microarray data analysis. The underlying bases for using bi-clustering in the analysis of gene expression data are (1) similar genes may exhibit similar behaviors only under a subset of conditions, not all conditions, (2) genes may participate in more than one function, resulting in one regulation pattern in one context and a different pattern in another. Using bi-clustering algorithms, one can obtain sets of genes that are co-regulated under subsets of conditions.

RESULTS

We develop a polynomial time algorithm to find an optimal bi-cluster with the maximum similarity score. To our knowledge, this is the first formulation for bi-cluster problems that admits a polynomial time algorithm for optimal solutions. The algorithm works for a special case, where the bi-clusters are approximately square. We then extend the algorithm to handle various other cases. Experiments on simulation data and real data show that the new algorithms outperform most of the existing methods in many cases. Our new algorithms have the following advantages: (1) they require no discretization procedure, (2) they perform well for overlapping bi-clusters and (3) they work well for additive bi-clusters.
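To make the term "additive bi-cluster" concrete, the sketch below checks whether a chosen set of genes (rows) and conditions (columns) fits the usual additive model, in which each entry is a row effect plus a column effect. This is only an illustration of the concept and is not the authors' algorithm; the function name `is_additive_bicluster`, the tolerance parameter, and the toy matrix are all assumptions introduced here.

```python
from typing import List, Sequence


def is_additive_bicluster(
    matrix: Sequence[Sequence[float]],
    rows: Sequence[int],
    cols: Sequence[int],
    tol: float = 1e-9,
) -> bool:
    """Return True if the submatrix fits a[i][j] = row_effect[i] + col_effect[j].

    Hypothetical helper for illustration only; not part of the published software.
    """
    if not rows or not cols:
        return False
    reference = rows[0]
    for r in rows[1:]:
        # Under the additive model, each row differs from the reference row
        # by the same constant across all selected columns.
        offsets: List[float] = [matrix[r][c] - matrix[reference][c] for c in cols]
        if max(offsets) - min(offsets) > tol:
            return False
    return True


# Toy expression matrix (made-up values): 4 genes x 5 conditions.
expr = [
    [1.0, 9.0, 2.0, 7.0, 3.0],
    [3.0, 0.0, 4.0, 1.0, 5.0],
    [0.0, 5.0, 1.0, 8.0, 2.0],
    [6.0, 2.0, 9.0, 4.0, 1.0],
]

genes = [0, 1, 2]   # candidate gene subset
conds = [0, 2, 4]   # candidate condition subset

print(is_additive_bicluster(expr, genes, conds))
print(is_additive_bicluster(expr, genes, [0, 1, 2, 3, 4]))
```

In this toy matrix the three selected genes follow a shared pattern only on the three selected conditions, which mirrors the motivation above: the pattern is lost once all conditions are considered.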

AVAILABILITY

The software is available at http://www.cs.cityu.edu.hk/~liuxw/msbe/help.html.

Similar articles

Clustering microarray gene expression data using weighted Chinese restaurant process.

Bioinformatics. 2006 Aug 15;22(16):1988-97. doi: 10.1093/bioinformatics/btl284. Epub 2006 Jun 9.

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.

J Mol Biol. 2001 Dec 14;314(5):1053-66. doi: 10.1006/jmbi.2000.5219.

Possibilistic approach for biclustering microarray data.

Comput Biol Med. 2007 Oct;37(10):1426-36. doi: 10.1016/j.compbiomed.2007.01.005. Epub 2007 Mar 8.

Clustering short time series gene expression data.

Bioinformatics. 2005 Jun;21 Suppl 1:i159-68. doi: 10.1093/bioinformatics/bti1022.

Gene expression module discovery using gibbs sampling.

Genome Inform. 2004;15(1):239-48.

Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing framework.

Artif Intell Med. 2007 Oct;41(2):105-15. doi: 10.1016/j.artmed.2007.08.002.

Bi-correlation clustering algorithm for determining a set of co-regulated genes.

Bioinformatics. 2009 Nov 1;25(21):2795-801. doi: 10.1093/bioinformatics/btp526. Epub 2009 Sep 3.

Analysis of a Gibbs sampler method for model-based clustering of gene expression data.

Bioinformatics. 2008 Jan 15;24(2):176-83. doi: 10.1093/bioinformatics/btm562. Epub 2007 Nov 22.

A novel approach for discovering overlapping clusters in gene expression data.

IEEE Trans Biomed Eng. 2009 Jul;56(7):1803-9. doi: 10.1109/TBME.2009.2015055. Epub 2009 Feb 20.

Cited by

funBIalign: a hierachical algorithm for functional motif discovery based on mean squared residue scores.

Stat Comput. 2025;35(1):11. doi: 10.1007/s11222-024-10537-y. Epub 2024 Dec 10.

Biclustering data analysis: a comprehensive survey.

Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae342.

Unsupervised Algorithms for Microarray Sample Stratification.

Methods Mol Biol. 2022;2401:121-146. doi: 10.1007/978-1-0716-1839-4_9.

Identifying Mitochondrial-Related Genes NDUFA10 and NDUFV2 as Prognostic Markers for Prostate Cancer through Biclustering.

Biomed Res Int. 2021 May 22;2021:5512624. doi: 10.1155/2021/5512624. eCollection 2021.

MCbiclust: a novel algorithm to discover large-scale functionally related gene sets from massive transcriptomics data collections.

Nucleic Acids Res. 2017 Sep 6;45(15):8712-8730. doi: 10.1093/nar/gkx590.

A systematic comparative evaluation of biclustering techniques.

BMC Bioinformatics. 2017 Jan 23;18(1):55. doi: 10.1186/s12859-017-1487-1.

Evaluation of Plaid Models in Biclustering of Gene Expression Data.

Scientifica (Cairo). 2016;2016:3059767. doi: 10.1155/2016/3059767. Epub 2016 Mar 9.

Quality measures for gene expression biclusters.

PLoS One. 2015 Mar 12;10(3):e0115497. doi: 10.1371/journal.pone.0115497. eCollection 2015.

Biclustering methods: biological relevance and application in gene expression analysis.

PLoS One. 2014 Mar 20;9(3):e90801. doi: 10.1371/journal.pone.0090801. eCollection 2014.

Pattern-driven neighborhood search for biclustering of microarray data.

BMC Bioinformatics. 2012 May 8;13 Suppl 7(Suppl 7):S11. doi: 10.1186/1471-2105-13-S7-S11.
