Exploring the functional landscape of gene expression: directed search of large microarray compendia.

Author information

Hibbs Matthew A, Hess David C, Myers Chad L, Huttenhower Curtis, Li Kai, Troyanskaya Olga G

Affiliation

Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA.

Publication information

Bioinformatics. 2007 Oct 15;23(20):2692-9. doi: 10.1093/bioinformatics/btm403. Epub 2007 Aug 27.

DOI:10.1093/bioinformatics/btm403

PMID:17724061

Abstract

MOTIVATION

The increasing availability of gene expression microarray technology has resulted in the publication of thousands of microarray gene expression datasets investigating various biological conditions. This vast repository is still underutilized due to the lack of methods for fast, accurate exploration of the entire compendium.

RESULTS

We have collected Saccharomyces cerevisiae gene expression microarray data containing roughly 2400 experimental conditions. We analyzed the functional coverage of this collection and we designed a context-sensitive search algorithm for rapid exploration of the compendium. A researcher using our system provides a small set of query genes to establish a biological search context; based on this query, we weight each dataset's relevance to the context, and within these weighted datasets we identify additional genes that are co-expressed with the query set. Our method exhibits an average increase in accuracy of 273% compared to previous mega-clustering approaches when recapitulating known biology. Further, we find that our search paradigm identifies novel biological predictions that can be verified through further experimentation. Our methodology provides the ability for biological researchers to explore the totality of existing microarray data in a manner useful for drawing conclusions and formulating hypotheses, which we believe is invaluable for the research community.
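The query-driven search described above can be illustrated with a minimal sketch. This is not the SPELL implementation: the weighting rule (mean pairwise Pearson correlation among the query genes, clipped at zero), the scoring rule (weighted mean correlation to the query set), and the function names `dataset_weight` and `search` are all assumptions chosen for illustration. The sketch also assumes every dataset is a genes-by-conditions NumPy array with the same gene order, no missing values, and no constant rows.

```python
import numpy as np


def dataset_weight(corr, query_idx):
    """Relevance of one dataset to the query (assumed rule, not SPELL's):
    mean pairwise correlation among the query genes, clipped at zero."""
    sub = corr[np.ix_(query_idx, query_idx)]
    upper = np.triu_indices(len(query_idx), k=1)  # each query pair once
    return max(float(sub[upper].mean()), 0.0)


def search(datasets, query_idx, top_k=5):
    """Rank non-query genes by weighted co-expression with the query set.

    datasets  : list of 2-D arrays, shape (n_genes, n_conditions)
    query_idx : row indices of the query genes (at least two)
    Returns (indices of the top_k genes, per-gene score array).
    """
    if len(query_idx) < 2:
        raise ValueError("need at least two query genes to weight datasets")

    n_genes = datasets[0].shape[0]
    scores = np.zeros(n_genes)
    total_weight = 0.0

    for expr in datasets:
        corr = np.corrcoef(expr)              # gene-by-gene Pearson matrix
        w = dataset_weight(corr, query_idx)   # how coherent is the query here?
        # Each gene's mean correlation to the query genes in this dataset,
        # counted in proportion to the dataset's relevance.
        scores += w * corr[:, query_idx].mean(axis=1)
        total_weight += w

    if total_weight > 0:
        scores /= total_weight

    scores[query_idx] = -np.inf               # do not return the query itself
    order = np.argsort(-scores)
    return [int(i) for i in order[:top_k]], scores
```

In this sketch, a dataset in which the query genes are uncorrelated receives a weight near zero and contributes little to the final ranking, which is the intuition behind weighting datasets by their relevance to the search context.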

AVAILABILITY

Our query-driven search engine, called SPELL, is available at http://function.princeton.edu/SPELL.

SUPPLEMENTARY INFORMATION

Several additional data files, figures and discussions are available at http://function.princeton.edu/SPELL/supplement.

Similar articles

Integration of GO annotations in Correspondence Analysis: facilitating the interpretation of microarray data.

Bioinformatics. 2005 May 15;21(10):2424-9. doi: 10.1093/bioinformatics/bti367. Epub 2005 Mar 3.

Combining gene expression profiles and protein-protein interaction data to infer gene functions.

J Biotechnol. 2006 Jul 25;124(3):475-85. doi: 10.1016/j.jbiotec.2006.01.024. Epub 2006 Mar 13.

Bioinformatics. 2007 Nov 15;23(22):3103-4. doi: 10.1093/bioinformatics/btm462. Epub 2007 Sep 25.

Knowledge guided analysis of microarray data.

J Biomed Inform. 2006 Aug;39(4):401-11. doi: 10.1016/j.jbi.2005.08.004. Epub 2005 Sep 15.

Large scale data mining approach for gene-specific standardization of microarray gene expression data.

Bioinformatics. 2006 Dec 1;22(23):2898-904. doi: 10.1093/bioinformatics/btl500. Epub 2006 Oct 10.

Analysis of a Gibbs sampler method for model-based clustering of gene expression data.

Bioinformatics. 2008 Jan 15;24(2):176-83. doi: 10.1093/bioinformatics/btm562. Epub 2007 Nov 22.

SEGS: search for enriched gene sets in microarray data.

J Biomed Inform. 2008 Aug;41(4):588-601. doi: 10.1016/j.jbi.2007.12.001. Epub 2007 Dec 15.

A Gibbs sampler for the identification of gene expression and network connectivity consistency.

Bioinformatics. 2006 Dec 15;22(24):3040-6. doi: 10.1093/bioinformatics/btl541. Epub 2006 Oct 23.

Context-sensitive data integration and prediction of biological networks.

Bioinformatics. 2007 Sep 1;23(17):2322-30. doi: 10.1093/bioinformatics/btm332. Epub 2007 Jun 28.

Cited by

Exploring weighting schemes for the discovery of informative generalized between pathway models to uncover pathways in genetic interaction networks.

Sci Rep. 2025 Aug 18;15(1):30169. doi: 10.1038/s41598-025-16353-2.

Clu1/Clu form mitochondria-associated granules upon metabolic transitions and regulate mitochondrial protein translation via ribosome interactions.

PLoS Genet. 2025 Jul 7;21(7):e1011773. doi: 10.1371/journal.pgen.1011773. eCollection 2025 Jul.

Mitochondrial-ER Contact Sites and Tethers Influence the Biosynthesis and Function of Coenzyme Q.

Contact (Thousand Oaks). 2025 Feb 3;8:25152564251316350. doi: 10.1177/25152564251316350. eCollection 2025 Jan-Dec.

Nonfunctional coq10 mutants maintain the ERMES complex and reveal true phenotypes associated with the loss of the coenzyme Q chaperone protein Coq10.

J Biol Chem. 2024 Nov;300(11):107820. doi: 10.1016/j.jbc.2024.107820. Epub 2024 Sep 27.

Divergence in the Saccharomyces Species' Heat Shock Response Is Indicative of Their Thermal Tolerance.

Genome Biol Evol. 2023 Nov 1;15(11). doi: 10.1093/gbe/evad207.

Global analysis of the yeast knockout phenome.

Sci Adv. 2023 May 26;9(21):eadg5702. doi: 10.1126/sciadv.adg5702.

Compendium-Wide Analysis of Pseudomonas aeruginosa Core and Accessory Genes Reveals Transcriptional Patterns across Strains PAO1 and PA14.

mSystems. 2023 Feb 23;8(1):e0034222. doi: 10.1128/msystems.00342-22. Epub 2022 Dec 21.

Refining cellular pathway models using an ensemble of heterogeneous data sources.

Ann Appl Stat. 2018 Sep;12(3):1361-1384. doi: 10.1214/16-aoas915. Epub 2018 Sep 11.

Distinct chromosomal "niches" in the genome of Saccharomyces cerevisiae provide the background for genomic innovation and shape the fate of gene duplicates.

NAR Genom Bioinform. 2022 Nov 14;4(4):lqac086. doi: 10.1093/nargab/lqac086. eCollection 2022 Dec.

BIONIC: biological network integration using convolutions.

Nat Methods. 2022 Oct;19(10):1250-1261. doi: 10.1038/s41592-022-01616-x. Epub 2022 Oct 3.
