Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA.
Bioinformatics. 2021 Nov 18;37(22):4248-4250. doi: 10.1093/bioinformatics/btab378.
The sparse allele vectors file format is an efficient storage format for large-scale DNA variation data. It is designed for high-throughput association analysis and leverages techniques for fast deserialization of data into computer memory. A command-line interface complements the storage format and supports basic operations such as importing, exporting and subsetting. In addition, a C++ API is available for easy integration into analysis software.
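To illustrate the general idea behind a sparse allele vector, the following is a minimal C++ sketch. It assumes that most alleles at a variant match the reference (value 0), so only the positions and values of non-zero entries are kept. The type `SparseAlleles` and the functions `to_sparse` and `dot` are hypothetical names chosen for this example. They are not part of the savvy API and do not describe the on-disk layout of the format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration type; NOT part of the savvy API.
struct SparseAlleles {
    std::size_t size = 0;              // total number of haplotypes
    std::vector<std::size_t> offsets;  // positions holding a non-zero allele
    std::vector<std::int8_t> values;   // allele value at each stored position
};

// Compress a dense allele vector by keeping only its non-zero entries.
SparseAlleles to_sparse(const std::vector<std::int8_t>& dense) {
    SparseAlleles s;
    s.size = dense.size();
    for (std::size_t i = 0; i < dense.size(); ++i) {
        if (dense[i] != 0) {
            s.offsets.push_back(i);
            s.values.push_back(dense[i]);
        }
    }
    return s;
}

// Dot product with a dense per-haplotype vector (for example, residuals in
// an association test). Only the stored non-zero entries are visited.
double dot(const SparseAlleles& s, const std::vector<double>& y) {
    assert(y.size() == s.size);
    double sum = 0.0;
    for (std::size_t k = 0; k < s.offsets.size(); ++k) {
        sum += s.values[k] * y[s.offsets[k]];
    }
    return sum;
}
```

Under this assumption, the cost of `dot` grows with the number of non-zero alleles rather than with the number of haplotypes, which is one reason a sparse representation can suit association analysis of rare variants.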
https://github.com/statgen/savvy.
Supplementary data are available at Bioinformatics online.