CORESH: a gene signature-based search engine for public gene expression datasets.

Author information

Sukhov Vladimir, Nugmanova Aigul, Vorontsov Yury, Mehrotra Parul, Kleverov Maksim, Ravichandran Kodi, Artyomov Maxim, Sergushichev Alexey

Affiliations

Department of Pathology and Immunology, Washington University in St. Louis School of Medicine, St. Louis, MO 63110, United States.

Computer Technologies Laboratory, ITMO University, Saint Petersburg 197101, Russia.

Publication information

Nucleic Acids Res. 2025 May 5. doi: 10.1093/nar/gkaf372.

Abstract

Public data repositories like Gene Expression Omnibus (GEO) contain an extensive amount of data from hundreds of thousands of experiments, making them a valuable resource for researchers. A common scenario for utilizing this resource is to show transcriptional similarity of one's own data to a public dataset as evidence of potentially similar biology. However, when searching for such datasets, researchers are usually limited to keyword-based search, which requires having a specific hypothesis and relies on the presence of high-quality metadata in public datasets. Here, we introduce CORESH, a web server designed to systematically find GEO datasets that match a user-provided gene signature (such as a list of top upregulated genes in response to a treatment) in a data-driven manner. CORESH operates on a compendium of >40 000 human and 40 000 mouse datasets and outputs a ranked list of datasets where the input genes exhibit similar expression patterns. The discovered datasets can then be used to identify experimental conditions associated with the activation of the query signature, offering insights into underlying biological mechanisms and guiding experimental validation. CORESH is freely accessible at https://alserglab.wustl.edu/coresh/, requires no login, and is regularly updated with the latest GEO data.
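
The idea of ranking datasets by how coherently a query signature behaves in each of them can be illustrated with a small sketch. The abstract does not describe CORESH's actual scoring method, so the code below is not the authors' algorithm: it assumes a simple stand-in score (the mean pairwise Pearson correlation of the query genes across the samples of a dataset), and the function names `coexpression_score` and `rank_datasets`, as well as the toy data, are hypothetical.

```python
import numpy as np


def coexpression_score(expr, genes, query):
    """Mean pairwise Pearson correlation of the query genes in one dataset.

    Assumed stand-in score, not the method used by CORESH.
    expr  : 2-D array-like, rows = genes, columns = samples
    genes : list of gene symbols labelling the rows of expr
    query : iterable of gene symbols forming the signature
    """
    query = set(query)
    rows = [i for i, g in enumerate(genes) if g in query]
    if len(rows) < 2:
        return float("nan")  # too few signature genes measured
    sub = np.asarray(expr, dtype=float)[rows]
    sub = sub[sub.std(axis=1) > 0]  # constant genes have no defined correlation
    if sub.shape[0] < 2:
        return float("nan")
    corr = np.corrcoef(sub)  # rows are treated as variables
    upper = corr[np.triu_indices_from(corr, k=1)]  # each gene pair once
    return float(upper.mean())


def rank_datasets(datasets, query):
    """Return (dataset name, score) pairs, best first; undefined scores last."""
    scores = {
        name: coexpression_score(expr, genes, query)
        for name, (expr, genes) in datasets.items()
    }
    return sorted(scores.items(), key=lambda kv: (np.isnan(kv[1]), -kv[1]))


# Toy data (hypothetical): three tiny "datasets" with four samples each.
datasets = {
    "A": ([[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 2, 3]], ["g1", "g2", "g3"]),
    "B": ([[1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 3, 4]], ["g1", "g2", "g3"]),
    "C": ([[1, 2, 3, 4], [5, 1, 7, 2]], ["g1", "g9"]),
}
ranked = rank_datasets(datasets, ["g1", "g2", "g3"])
for name, score in ranked:
    print(name, score)
```

In this toy setup, dataset A (where all three signature genes rise together) ranks above B (where one gene moves in the opposite direction), and C is placed last because it measures only one of the query genes. A real search engine would also need to handle gene identifier mapping, normalization, and the scale of tens of thousands of datasets, none of which this sketch attempts.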
