CRAM 3.1: advances in the CRAM file format.

Affiliation

Informatics and Digital Solutions, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

Publication information

Bioinformatics. 2022 Mar 4;38(6):1497-1503. doi: 10.1093/bioinformatics/btac010.

Abstract

MOTIVATION

CRAM has established itself as a high-compression alternative to the BAM file format for DNA sequencing data. We describe updates that further improve its compression on data from modern sequencing instruments.

RESULTS

On Illumina data, a CRAM 3.1 file is 7-15% smaller than the equivalent CRAM 3.0 file and 50-70% smaller than the corresponding BAM file. Long-read technologies show more modest compression gains because of the presence of high-entropy signals.

AVAILABILITY AND IMPLEMENTATION

The CRAM 3.0 specification is freely available from https://samtools.github.io/hts-specs/CRAMv3.pdf. The CRAM 3.1 improvements are available in a separate open-source HTScodecs library at https://github.com/samtools/htscodecs, and have been incorporated into HTSlib.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
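
As a rough illustration of what the percentages in the Results mean in practice, the following Python sketch converts them into file-size ranges. It is only arithmetic on the quoted figures: the function names and the 100 GB and 40 GB inputs are hypothetical, and applying the extremes of each range to a single file is an assumption, since real savings depend on the data.

```python
from __future__ import annotations


def cram31_size_range(bam_size_gb: float,
                      reduction_low: float = 0.50,
                      reduction_high: float = 0.70) -> tuple[float, float]:
    """Return (smallest, largest) CRAM 3.1 size implied by a BAM size.

    "50-70% smaller than BAM" means the CRAM 3.1 file is 30-50% of the
    BAM size. The default reductions are the figures quoted in the abstract.
    """
    if bam_size_gb < 0:
        raise ValueError("file size must be non-negative")
    smallest = bam_size_gb * (1.0 - reduction_high)
    largest = bam_size_gb * (1.0 - reduction_low)
    return smallest, largest


def cram30_size_range(cram31_size_gb: float,
                      reduction_low: float = 0.07,
                      reduction_high: float = 0.15) -> tuple[float, float]:
    """Return (smallest, largest) CRAM 3.0 size implied by a CRAM 3.1 size.

    If CRAM 3.1 is a fraction r smaller than CRAM 3.0, then
    size_31 = size_30 * (1 - r), so size_30 = size_31 / (1 - r).
    """
    if cram31_size_gb < 0:
        raise ValueError("file size must be non-negative")
    smallest = cram31_size_gb / (1.0 - reduction_low)
    largest = cram31_size_gb / (1.0 - reduction_high)
    return smallest, largest


if __name__ == "__main__":
    # Hypothetical 100 GB Illumina BAM file.
    low, high = cram31_size_range(100.0)
    print(f"CRAM 3.1: {low:.1f} to {high:.1f} GB")  # CRAM 3.1: 30.0 to 50.0 GB

    # Hypothetical 40 GB CRAM 3.1 file, converted back to a CRAM 3.0 estimate.
    low30, high30 = cram30_size_range(40.0)
    print(f"CRAM 3.0: {low30:.1f} to {high30:.1f} GB")
```

Note that the two percentages use different baselines (CRAM 3.0 and BAM respectively), so they cannot simply be added or subtracted.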
