Roberson Elisha D O
Washington University in St. Louis, Departments of Medicine & Genetics, Division of Rheumatology, St. Louis, MO 63110.
bioRxiv. 2025 Mar 20:2025.03.18.643355. doi: 10.1101/2025.03.18.643355.
Translational research is often a collaborative enterprise involving basic science researchers, clinicians, and experts in genomics and bioinformatics. While central university and industry cores support data generation, long-term storage often falls to individual investigators. We frequently provide long-term FASTQ file storage for our collaborators. To reduce our cold storage footprint, we tested the space savings of the gzip and zstandard algorithms on an old set of FASTQ files. We found that zstandard achieved a better overall compression ratio than the best gzip algorithm, amounting to more than 20% space savings compared to gzip. For small, collaborative genomics labs, transitioning to zstandard compression may be worthwhile to minimize cold storage costs.
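As a rough illustration of the kind of comparison described above, the following Python sketch compresses a small synthetic FASTQ payload with gzip and, if available, zstandard, then reports compression ratios and relative space savings. It is not the authors' benchmark: the synthetic reads, the compression levels (gzip 9, zstandard 19), and all helper names (`make_synthetic_fastq`, `gzip_size`, `zstd_size`, `compression_ratio`, `space_savings`) are assumptions made for illustration. The zstandard step assumes the third-party `zstandard` package is installed and is skipped otherwise.

```python
import gzip
import random


def make_synthetic_fastq(n_reads=2000, read_len=100, seed=0):
    """Build a small synthetic FASTQ payload (illustrative only, not real data)."""
    rng = random.Random(seed)
    records = []
    for i in range(n_reads):
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        qual = "".join(rng.choice("FFFF:,#") for _ in range(read_len))
        records.append(f"@read{i}\n{seq}\n+\n{qual}\n")
    return "".join(records).encode("ascii")


def gzip_size(data, level=9):
    """Return the size in bytes of data compressed with gzip at the given level."""
    return len(gzip.compress(data, compresslevel=level))


def zstd_size(data, level=19):
    """Return the zstandard-compressed size, or None if the package is missing."""
    try:
        import zstandard  # third-party package; assumed, not part of the stdlib
    except ImportError:
        return None
    return len(zstandard.ZstdCompressor(level=level).compress(data))


def compression_ratio(original_size, compressed_size):
    """Ratio of original to compressed size (higher means better compression)."""
    return original_size / compressed_size


def space_savings(baseline_size, candidate_size):
    """Fraction of space saved by the candidate relative to the baseline."""
    return (baseline_size - candidate_size) / baseline_size


if __name__ == "__main__":
    data = make_synthetic_fastq()
    gz = gzip_size(data)
    print(f"raw bytes:  {len(data)}")
    print(f"gzip -9:    {gz} bytes, ratio {compression_ratio(len(data), gz):.2f}")
    zs = zstd_size(data)
    if zs is None:
        print("zstandard package not installed; skipping zstd comparison")
    else:
        print(f"zstd -19:   {zs} bytes, ratio {compression_ratio(len(data), zs):.2f}")
        print(f"zstd savings vs gzip: {space_savings(gz, zs):.1%}")
```

Results on randomly generated reads will not match those on real sequencing data, so any savings printed here say nothing about the figure reported in the abstract; the sketch only shows how such a ratio and savings percentage could be computed.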