Theilhaber Joachim, Ulyanov Anatoly, Malanthara Anish, Cole Jack, Xu Dapeng, Nahf Robert, Heuer Michael, Brockel Christoph, Bushnell Steven
Cambridge Genomics Center, Sanofi-Aventis, 26 Landsdowne Street, Cambridge, MA 02139, USA.
BMC Bioinformatics. 2004 Dec 10;5:195. doi: 10.1186/1471-2105-5-195.
Gecko (Gene Expression: Computation and Knowledge Organization) is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community.
Gecko is based on a client-server architecture, with a centralized repository that typically holds many tens of thousands of Affymetrix scans. It includes automatic processing pipelines for uploading data from remote sites, a database, a computational engine implementing approximately 50 different analysis tools, and a client application. Available analysis tools include clustering methods, principal component analysis, supervised classification with feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. Because of its open architecture, Gecko also allows new algorithms to be integrated. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (actually a directed acyclic graph), in which all successive results of ongoing analyses are saved. This approach has proven invaluable in allowing a large (approximately 100 users) and distributed community to share results, and to return repeatedly, over a span of years, to older and potentially very complex analyses of gene expression data.
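The Analysis Tree idea, in which every result is saved together with the results it was derived from, can be illustrated with a minimal sketch. This is not Gecko's implementation: the names `AnalysisNode`, `AnalysisGraph`, `add`, and `lineage` are hypothetical, and the sketch assumes only that each saved result records its parent results, so that the structure is a directed acyclic graph.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisNode:
    """One saved result (hypothetical structure, not Gecko's schema)."""
    node_id: str
    tool: str                                   # analysis tool that produced the result
    parents: tuple = ()                         # ids of the results it was derived from
    payload: dict = field(default_factory=dict)


class AnalysisGraph:
    """Append-only store of results; acyclic because parents must already exist."""

    def __init__(self):
        self._nodes = {}

    def add(self, node_id, tool, parents=(), payload=None):
        if node_id in self._nodes:
            raise ValueError(f"result {node_id!r} is already saved")
        missing = [p for p in parents if p not in self._nodes]
        if missing:
            raise KeyError(f"unknown parent results: {missing}")
        node = AnalysisNode(node_id, tool, tuple(parents), payload or {})
        self._nodes[node_id] = node
        return node

    def lineage(self, node_id):
        """Return all ancestors of node_id, parents first, ending with node_id."""
        order, seen = [], set()

        def visit(nid):
            if nid in seen:
                return
            seen.add(nid)
            for parent in self._nodes[nid].parents:
                visit(parent)
            order.append(nid)

        visit(node_id)
        return order


graph = AnalysisGraph()
graph.add("scans", "upload")
graph.add("clusters", "clustering", parents=["scans"])
graph.add("anova", "anova", parents=["scans"])
# A result with two parents is what makes this a graph rather than a tree.
graph.add("report", "contrast", parents=["clusters", "anova"])
print(graph.lineage("report"))
```

Because a node can only reference results that are already stored, cycles cannot be created, and `lineage` lets a user returning to an old result recover the full chain of steps that produced it.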
The Gecko system is being made publicly available as free software at http://sourceforge.net/projects/geckoe. In whole or in part, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.