CoGenT++: an extensive and extensible data environment for computational genomics.

Author information

Goldovsky Leon, Janssen Paul, Ahrén Dag, Audit Benjamin, Cases Ildefonso, Darzentas Nikos, Enright Anton J, López-Bigas Núria, Peregrin-Alvarez José M, Smith Mike, Tsoka Sophia, Kunin Victor, Ouzounis Christos A

Affiliations

Computational Genomics Group, The European Bioinformatics Institute EMBL, Cambridge Outstation, Cambridge CB10 1SD, UK.

Publication information

Bioinformatics. 2005 Oct 1;21(19):3806-10. doi: 10.1093/bioinformatics/bti579.

DOI:10.1093/bioinformatics/bti579

PMID:16216832

Abstract

MOTIVATION

CoGenT++ is a data environment for computational research in comparative and functional genomics, designed to address issues of consistency, reproducibility, scalability and accessibility.

DESCRIPTION

CoGenT++ facilitates the re-distribution of all fully sequenced and published genomes, storing information about species, gene names and protein sequences. We describe our scalable implementation of ProXSim, a continually updated all-against-all similarity database, which stores pairwise relationships between all genome sequences. Based on these similarities, derived databases are generated for gene fusions (AllFuse), putative orthologs (OFAM), protein families (TRIBES), phylogenetic profiles (ProfUse) and phylogenetic trees. Extensions based on the CoGenT++ environment include disease gene prediction, pattern discovery, automated domain detection, genome annotation and ancestral reconstruction.

CONCLUSION

CoGenT++ provides a comprehensive environment for computational genomics, accessible primarily for large-scale analyses as well as manual browsing.
