Barrett Tanya, Troup Dennis B, Wilhite Stephen E, Ledoux Pierre, Evangelista Carlos, Kim Irene F, Tomashevsky Maxim, Marshall Kimberly A, Phillippy Katherine H, Sherman Patti M, Muertter Rolf N, Holko Michelle, Ayanbule Oluwabukunmi, Yefanov Andrey, Soboleva Alexandra
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892, USA.
Nucleic Acids Res. 2011 Jan;39(Database issue):D1005-10. doi: 10.1093/nar/gkq1184. Epub 2010 Nov 21.
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.