College of Life Sciences, Henan Agricultural University, Zhengzhou, China.
National Key Laboratory of Wheat and Maize Crop Science, Henan Agricultural University, Zhengzhou, China.
Bioinformatics. 2019 Oct 15;35(20):4181-4183. doi: 10.1093/bioinformatics/btz186.
We proposed storing large-scale genotype data as integer sparse matrices, which consumed far fewer computing resources for storage and analysis than traditional approaches. In addition, the raw genotype data could be readily recovered from the integer sparse matrices. Using this approach, we stored the genotype data of 1612 Asian cultivated rice accessions and 446 Asian wild rice accessions across 8 584 244 SNP sites in the ECOGEMS database with 310 MB of disk usage. A graphical interface for visualization, analysis and download of SNP data was implemented in ECOGEMS, making it a valuable resource for rice functional genomic studies.
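The storage idea can be illustrated with a minimal sketch. The abstract does not describe the actual encoding used in ECOGEMS, so everything below is an assumption made for illustration: the integer codes, the function names `encode` and `decode`, and the choice to treat a call matching the reference allele as an implicit zero that is never stored. Under those assumptions, only calls that differ from the reference occupy memory, and the raw genotypes can be rebuilt exactly.

```python
from typing import Dict, List, Tuple

# Hypothetical integer codes (not taken from the paper):
# 0 = same as the reference allele (implicit, never stored),
# 1-4 = a base that differs from the reference, 5 = missing call.
CODES: Dict[str, int] = {"A": 1, "C": 2, "G": 3, "T": 4, "N": 5}
BASES: Dict[int, str] = {code: base for base, code in CODES.items()}

SparseGenotypes = Dict[Tuple[int, int], int]


def encode(genotypes: List[str], reference: List[str]) -> SparseGenotypes:
    """Keep only the calls that differ from the reference allele.

    genotypes[i][j] is the call at SNP site i for accession j.
    """
    sparse: SparseGenotypes = {}
    for site, (calls, ref) in enumerate(zip(genotypes, reference)):
        for accession, allele in enumerate(calls):
            if allele != ref:
                sparse[(site, accession)] = CODES[allele]
    return sparse


def decode(sparse: SparseGenotypes, reference: List[str],
           n_accessions: int) -> List[str]:
    """Rebuild the raw genotype strings from the sparse entries."""
    rows = [[ref] * n_accessions for ref in reference]
    for (site, accession), code in sparse.items():
        rows[site][accession] = BASES[code]
    return ["".join(row) for row in rows]


# Toy data: 3 SNP sites, 4 accessions.
reference = ["A", "G", "T"]
genotypes = ["AAAC", "GGGG", "TNTA"]

sparse = encode(genotypes, reference)
restored = decode(sparse, reference, n_accessions=4)
```

In this toy example only 3 of the 12 calls are stored. Because most accessions carry the reference allele at most SNP sites, the same effect at the scale of millions of sites is what would make a sparse representation compact.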
The code and data of ECOGEMS are freely available at https://github.com/venyao/ECOGEMS. ECOGEMS is deployed at http://ecogems.ncpgr.cn and http://150.109.59.144:3838/ECOGEMS/ for online use.
Supplementary data are available at Bioinformatics online.