Elnitski Laura L, Shah Prachi, Moreland R Travis, Umayam Lowell, Wolfsberg Tyra G, Baxevanis Andreas D
Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.
Genome Res. 2007 Jun;17(6):954-9. doi: 10.1101/gr.5582207.
The Encyclopedia of DNA Elements (ENCODE) project aims to identify and characterize all functional elements in a representative chromosomal sample comprising 1% of the human genome. Data generated by members of The ENCODE Project Consortium are housed in a number of public databases, such as the UCSC Genome Browser, NCBI's Gene Expression Omnibus (GEO), and EBI's ArrayExpress. As a result, it is often difficult for biologists to gather all of the ENCODE data for a particular genomic region of interest and integrate them with relevant information found in other public databases. The ENCODEdb portal was developed to address this problem. ENCODEdb provides a unified, single point of access to data generated by the ENCODE Consortium, as well as to data from other source databases that lie within ENCODE regions, giving the user a complete view of all known data in a particular region of interest. ENCODEdb Genomic Context searches retrieve information on functional elements annotated within ENCODE regions, including mRNA, EST, and STS sequences; single nucleotide polymorphisms; and UniGene clusters. Information is also retrieved from GEO, OMIM, and major genome sequence browsers. ENCODEdb Consortium Data searches allow users to perform compound queries on array-based ENCODE data available from both GEO and the UCSC Genome Browser. Results are retrieved for a specific genomic area of interest and can be further manipulated in a variety of contexts, including the UCSC Genome Browser and the Galaxy large-scale genome analysis platform. The ENCODEdb portal is freely accessible at http://research.nhgri.nih.gov/ENCODEdb.
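The idea of gathering records from several sources for one genomic region can be illustrated with a minimal sketch. This is not the ENCODEdb implementation or its interface: the `Feature` class, the `query_region` function, the source labels, and every record below are hypothetical placeholders, and the coordinates are made up for illustration. Intervals are assumed to be 0-based and half-open.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Feature:
    """A hypothetical annotation record pooled from some source database."""
    source: str  # illustrative label such as "mRNA" or "SNP"
    name: str
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # 0-based, exclusive (assumed convention)


def overlaps(feature: Feature, chrom: str, start: int, end: int) -> bool:
    """Return True if the feature shares at least one base with the region."""
    return feature.chrom == chrom and feature.start < end and start < feature.end


def query_region(features: Iterable[Feature], chrom: str,
                 start: int, end: int) -> List[Feature]:
    """Collect every feature overlapping the region, ordered by position."""
    hits = [f for f in features if overlaps(f, chrom, start, end)]
    return sorted(hits, key=lambda f: (f.start, f.end, f.source))


# Made-up records standing in for data pooled from different sources.
FEATURES = [
    Feature("mRNA", "example_mrna", "chr1", 100, 500),
    Feature("SNP", "example_snp", "chr1", 450, 451),
    Feature("array", "example_probe", "chr1", 900, 950),
    Feature("EST", "example_est", "chr2", 100, 500),
]

hits = query_region(FEATURES, "chr1", 400, 600)
for f in hits:
    print(f.source, f.name, f.chrom, f.start, f.end)
```

Pooling all sources into one list and filtering by interval overlap keeps the sketch short; a real portal would presumably rely on indexed database queries rather than a linear scan.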