Clustering gene expression patterns.

Authors

Ben-Dor A, Shamir R, Yakhini Z

Affiliation

Department of Computer Science and Engineering, University of Washington, Seattle 98105, USA.

Publication information

J Comput Biol. 1999 Fall-Winter;6(3-4):281-97. doi: 10.1089/106652799318274.

PMID:10582567

Abstract

Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multicondition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is $O(n^2 (\log n)^c)$. We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its performance is demonstrated on simulated data and on real gene expression data, with very promising results.
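The abstract does not spell out the error model or the heuristic, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a binary gene-by-gene similarity matrix in which each pairwise "same cluster" relation is flipped independently with probability `alpha`, and a simple greedy rule that grows one cluster at a time by average similarity. The function names, the `alpha` and `threshold` parameters, and the greedy rule itself are all assumptions introduced here for illustration; the sketch does not achieve or claim the running-time bound stated in the abstract.

```python
import random


def corrupted_similarity(labels, alpha, rng):
    """Build a symmetric 0/1 similarity matrix from true cluster labels.

    Assumed error model (hypothetical): each pairwise same-cluster
    relation is flipped independently with probability alpha.
    """
    n = len(labels)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            if rng.random() < alpha:
                same = not same  # corrupt this pair
            sim[i][j] = sim[j][i] = int(same)
    return sim


def greedy_affinity_clusters(sim, threshold=0.5):
    """Grow clusters one at a time (illustrative heuristic only).

    Starting from the lowest-numbered unassigned gene, repeatedly add the
    unassigned gene with the highest average similarity to the current
    cluster, stopping when that average drops below threshold.
    """
    unassigned = set(range(len(sim)))
    clusters = []
    while unassigned:
        seed = min(unassigned)
        cluster = [seed]
        unassigned.remove(seed)
        while unassigned:
            # Ties are broken in favour of the lowest gene index.
            best = max(sorted(unassigned),
                       key=lambda g: sum(sim[g][m] for m in cluster))
            affinity = sum(sim[best][m] for m in cluster) / len(cluster)
            if affinity < threshold:
                break
            cluster.append(best)
            unassigned.remove(best)
        clusters.append(sorted(cluster))
    return clusters


if __name__ == "__main__":
    true_labels = [0, 0, 0, 1, 1, 2, 2, 2, 2]
    noisy = corrupted_similarity(true_labels, alpha=0.1, rng=random.Random(0))
    print(greedy_affinity_clusters(noisy))
```

With `alpha=0.0` the similarity matrix is noise-free and the sketch returns the true clusters exactly. With noise, recovery depends on the cluster sizes, the flip probability, and the threshold, which is the kind of question the paper's probabilistic analysis addresses.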

Similar articles

Clustering gene expression patterns.

J Comput Biol. 1999 Fall-Winter;6(3-4):281-97. doi: 10.1089/106652799318274.

A mixture model with random-effects components for clustering correlated gene-expression profiles.

Bioinformatics. 2006 Jul 15;22(14):1745-52. doi: 10.1093/bioinformatics/btl165. Epub 2006 May 3.

Clustering microarray gene expression data using weighted Chinese restaurant process.

Bioinformatics. 2006 Aug 15;22(16):1988-97. doi: 10.1093/bioinformatics/btl284. Epub 2006 Jun 9.

Knowledge-assisted recognition of cluster boundaries in gene expression data.

Artif Intell Med. 2005 Sep-Oct;35(1-2):171-83. doi: 10.1016/j.artmed.2005.02.007.

Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm.

Bioinformatics. 2006 Jan 1;22(1):58-67. doi: 10.1093/bioinformatics/bti746. Epub 2005 Oct 27.

Incorporating gene functions as priors in model-based clustering of microarray gene expression data.

Bioinformatics. 2006 Apr 1;22(7):795-801. doi: 10.1093/bioinformatics/btl011. Epub 2006 Jan 24.

Evaluation and comparison of gene clustering methods in microarray analysis.

Bioinformatics. 2006 Oct 1;22(19):2405-12. doi: 10.1093/bioinformatics/btl406. Epub 2006 Jul 31.

A novel approach for discovering overlapping clusters in gene expression data.

IEEE Trans Biomed Eng. 2009 Jul;56(7):1803-9. doi: 10.1109/TBME.2009.2015055. Epub 2009 Feb 20.

Ensemble clustering method based on the resampling similarity measure for gene expression data.

Stat Methods Med Res. 2007 Dec;16(6):539-64. doi: 10.1177/0962280206071842. Epub 2007 Aug 14.

Clustering of change patterns using Fourier coefficients.

Bioinformatics. 2008 Jan 15;24(2):184-91. doi: 10.1093/bioinformatics/btm568. Epub 2007 Nov 19.

Cited by

Cell identity and 5-hydroxymethylcytosine.

Epigenetics Chromatin. 2025 Jun 19;18(1):36. doi: 10.1186/s13072-025-00601-w.

Robust hierarchical co-clustering for exploring toxicogenomic biomarkers and their chemical regulators.

Sci Rep. 2025 May 14;15(1):16676. doi: 10.1038/s41598-025-99568-7.

scDFN: enhancing single-cell RNA-seq clustering with deep fusion networks.

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae486.

Long-read sequencing reveals extensive gut phageome structural variations driven by genetic exchange with bacterial hosts.

Sci Adv. 2024 Aug 16;10(33):eadn3316. doi: 10.1126/sciadv.adn3316. Epub 2024 Aug 14.

Network Topology Evaluation and Transitive Alignments for Molecular Networking.

J Am Soc Mass Spectrom. 2024 Sep 4;35(9):2165-2175. doi: 10.1021/jasms.4c00208. Epub 2024 Aug 12.

Decomposition of dynamic transcriptomic responses during effector-triggered immunity reveals conserved responses in two distinct plant cell populations.

Plant Commun. 2024 Aug 12;5(8):100882. doi: 10.1016/j.xplc.2024.100882. Epub 2024 Mar 16.

Molecular Engineering to Achieve AIE-active Fluorophore with Near-infrared (NIR) Emission and Temperature-sensitive Property.

J Fluoresc. 2024 May;34(3):1109-1117. doi: 10.1007/s10895-023-03338-5. Epub 2023 Jul 20.

Statistical analysis of high-dimensional biomedical data: a gentle introduction to analytical goals, common approaches and challenges.

BMC Med. 2023 May 15;21(1):182. doi: 10.1186/s12916-023-02858-y.

PriPath: identifying dysregulated pathways from differential gene expression via grouping, scoring, and modeling with an embedded feature selection approach.

BMC Bioinformatics. 2023 Feb 23;24(1):60. doi: 10.1186/s12859-023-05187-2.

scBGEDA: deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering.

Bioinformatics. 2023 Feb 14;39(2). doi: 10.1093/bioinformatics/btad075.
