Son Keun Hong, Cho Je-Yoel
Department of Biochemistry, College of Veterinary Medicine, Seoul National University, Seoul, 08826, Korea.
Comparative Medicine and Disease Research Center (CDRC), Science Research Center (SRC), Seoul National University, Seoul, 08826, Korea.
Bioinformatics. 2025 Mar 29;41(4). doi: 10.1093/bioinformatics/btaf128.
The volume of multi-omics data for diverse species is growing at an unprecedented rate, with new genome assemblies, related annotations, and high-throughput sequencing resources being submitted daily to various genomic data repositories. In response to this data influx, both existing and new databases are establishing optimized hierarchical structures to manage the vast amount of information. However, the lack of accessible command-line tools, combined with the functional limitations and unintuitive design of existing options, presents significant challenges for researchers. This gap underscores a critical need for a tool that enables streamlined retrieval and integration of omics data across these diverse repositories.
We have developed Gencube, a command-line tool that enables centralized retrieval and integration of six data types (genome assemblies, gene sets, annotations, sequences, comparative genomic data, and NGS-based omics resources) from various leading databases.
Gencube is a free and open-source tool, with its code available on GitHub: https://github.com/snu-cdrc/gencube and also archived on Zenodo: https://doi.org/10.5281/zenodo.14607649.