Tian Wei, Ding Wubin, Shen Jiawei, Li Daofeng, Wang Ting, Ecker Joseph R
Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA.
Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA.
bioRxiv. 2024 May 15:2023.09.22.559047. doi: 10.1101/2023.09.22.559047.
Single-cell DNA methylation studies yield vast datasets, and existing data formats struggle to store them and operate on them efficiently, highlighting a need for improved solutions.
BAllC (Binary All Cytosines) is a binary format tailored to methylation data that addresses these challenges. BAllCools, its complementary software toolkit, enhances parsing, indexing, and querying capabilities, promising faster operations and reduced storage needs.