A framework for collaborative analysis of ENCODE data: making large-scale analyses biologist-friendly.

Authors

Blankenberg Daniel, Taylor James, Schenck Ian, He Jianbin, Zhang Yi, Ghent Matthew, Veeraraghavan Narayanan, Albert Istvan, Miller Webb, Makova Kateryna D, Hardison Ross C, Nekrutenko Anton

Affiliation

Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Penn State University, University Park, Pennsylvania 16802, USA.

Publication

Genome Res. 2007 Jun;17(6):960-4. doi: 10.1101/gr.5578007.

Abstract

The standardization and sharing of data and tools are the biggest challenges of large collaborative projects such as the Encyclopedia of DNA Elements (ENCODE). Here we describe a compact Web application, Galaxy2(ENCODE), that effectively addresses these issues. It provides an intuitive interface for the deposition and access of data, and features a vast number of analysis tools including operations on genomic intervals, utilities for manipulation of multiple sequence alignments, and molecular evolution algorithms. By providing a direct link between data and analysis tools, Galaxy2(ENCODE) allows addressing biological questions that are beyond the reach of existing software. We use Galaxy2(ENCODE) to show that the ENCODE regions contain >2000 unannotated transcripts under strong purifying selection that are likely functional. We also show that the ENCODE regions are representative of the entire genome by estimating the rate of nucleotide substitution and comparing it to published data. Although each of these analyses is complex, none takes more than 15 min from beginning to end. Finally, we demonstrate how new tools can be added to Galaxy2(ENCODE) with almost no effort. Every section of the manuscript is supplemented with QuickTime screencasts. Galaxy2(ENCODE) and the screencasts can be accessed at http://g2.bx.psu.edu.
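The abstract lists "operations on genomic intervals" among the available tools. As a rough illustration of what such an operation does, the sketch below intersects two sets of intervals, for example transcripts against conserved regions. It is not Galaxy2(ENCODE) code: the function name `intersect_intervals`, the tuple layout `(chrom, start, end)`, the 0-based half-open coordinates, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def intersect_intervals(a, b):
    """Return the overlapping pieces between two interval lists.

    Each interval is a hypothetical (chrom, start, end) tuple with
    0-based, half-open coordinates (an assumption for this sketch).
    """
    # Group the second set by chromosome and sort each group by start.
    by_chrom = defaultdict(list)
    for chrom, start, end in b:
        by_chrom[chrom].append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()

    result = []
    for chrom, start, end in sorted(a):
        for b_start, b_end in by_chrom.get(chrom, []):
            if b_start >= end:
                # Groups are sorted by start, so nothing later can overlap.
                break
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:
                result.append((chrom, lo, hi))
    return result


# Made-up example data, not taken from the ENCODE regions.
transcripts = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
conserved = [("chr1", 150, 250), ("chr1", 550, 560), ("chr3", 0, 10)]
overlaps = intersect_intervals(transcripts, conserved)
print(overlaps)
```

The nested loop keeps the sketch short; a tool working at genome scale would more likely use a sweep over both sorted lists or an interval tree.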


