ECOGEMS: efficient compression and retrieve of SNP data of 2058 rice accessions with integer sparse matrices.

Affiliations

College of Life Sciences, Henan Agricultural University, Zhengzhou, China.

National Key Laboratory of Wheat and Maize Crop Science, Henan Agricultural University, Zhengzhou, China.

Publication information

Bioinformatics. 2019 Oct 15;35(20):4181-4183. doi: 10.1093/bioinformatics/btz186.

Abstract

SUMMARY

We proposed to store large-scale genotype data as integer sparse matrices, which consumed far fewer computing resources for storage and analysis than traditional approaches. In addition, the raw genotype data could be readily recovered from the integer sparse matrices. Using this approach, we stored the genotype data of 1612 Asian cultivated rice accessions and 446 Asian wild rice accessions across 8 584 244 SNP sites in the ECOGEMS database with 310 MB of disk usage. Graphical interfaces for visualization, analysis and download of SNP data were implemented in ECOGEMS, making it a valuable resource for rice functional genomic studies.
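The idea of an integer sparse matrix that still allows the raw genotypes to be recovered can be illustrated with a minimal sketch. The abstract does not describe the actual encoding used by ECOGEMS, so everything below is an assumption made for illustration: the `ALLELE_CODES` table, the choice of each site's most frequent allele as the implicit zero, and the dictionary-of-keys layout are all hypothetical.

```python
from collections import Counter

# Hypothetical integer codes for alleles; 0 is reserved for
# "same as this site's most frequent allele" and is never stored.
ALLELE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4, "N": 5}
CODE_ALLELES = {code: allele for allele, code in ALLELE_CODES.items()}


def compress(genotypes):
    """Encode a sites x accessions allele table as a sparse integer matrix.

    Only cells that differ from the site's most frequent allele are kept,
    as {(row, col): integer_code} entries.
    """
    n_cols = len(genotypes[0]) if genotypes else 0
    major = []
    entries = {}
    for i, row in enumerate(genotypes):
        most_common = Counter(row).most_common(1)[0][0]
        major.append(most_common)
        for j, allele in enumerate(row):
            if allele != most_common:
                entries[(i, j)] = ALLELE_CODES[allele]
    return {"shape": (len(genotypes), n_cols), "major": major, "entries": entries}


def decompress(sparse):
    """Rebuild the original allele table from the sparse representation."""
    n_rows, n_cols = sparse["shape"]
    rows = [[sparse["major"][i]] * n_cols for i in range(n_rows)]
    for (i, j), code in sparse["entries"].items():
        rows[i][j] = CODE_ALLELES[code]
    return rows


# Toy data: 4 SNP sites (rows) x 5 accessions (columns).
raw = [list("AAAAG"), list("CCTCC"), list("GGGGG"), list("TNTTA")]
sparse = compress(raw)
restored = decompress(sparse)
```

In this toy table only 4 of the 20 cells differ from their site's most frequent allele, so only 4 integer entries are stored, and `decompress` returns the original table unchanged. The saving comes from the fact that most accessions share the same allele at most SNP sites.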

AVAILABILITY AND IMPLEMENTATION

The code and data of ECOGEMS are freely available at https://github.com/venyao/ECOGEMS. ECOGEMS is deployed at http://ecogems.ncpgr.cn and http://150.109.59.144:3838/ECOGEMS/ for online use.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

