Dupsifter: a lightweight duplicate marking tool for whole genome bisulfite sequencing.

Author information

Department of Epigenetics, Van Andel Institute, Grand Rapids, MI 49503, United States.

Center for Computational and Genomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, United States.

Publication information

Bioinformatics. 2023 Dec 1;39(12). doi: 10.1093/bioinformatics/btad729.

Abstract

SUMMARY

In whole genome sequencing data, polymerase chain reaction amplification results in duplicate DNA fragments coming from the same location in the genome. The process of preparing a whole genome bisulfite sequencing (WGBS) library, on the other hand, can create two DNA fragments from the same location that should not be considered duplicates. Currently, only one WGBS-aware duplicate marking tool exists. However, it only works with the output from a single tool, does not accept streaming input or output, and requires a substantial amount of memory relative to the input size. Dupsifter provides an aligner-agnostic duplicate marking tool that is lightweight, has streaming capabilities, and is memory efficient.
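The distinction drawn above can be illustrated with a minimal sketch. This is not Dupsifter's implementation: the `Fragment` record, the `mark_duplicates` function, and the `OT`/`OB` (original top/original bottom) strand labels are assumptions made for illustration only. The idea is that a WGBS-aware duplicate signature includes the bisulfite strand in addition to the genomic coordinates, so two fragments at the same location are flagged only when they also come from the same bisulfite strand.

```python
from collections import namedtuple

# Hypothetical, simplified fragment record (not Dupsifter's internal representation).
Fragment = namedtuple(
    "Fragment", ["name", "chrom", "start", "end", "bisulfite_strand"]
)


def mark_duplicates(fragments):
    """Return the names of fragments flagged as duplicates.

    The first fragment seen for a given
    (chrom, start, end, bisulfite_strand) signature is kept;
    any later fragment with the same signature is flagged.
    Keeping the first one seen is a simplification for this sketch.
    """
    seen = set()
    duplicates = set()
    for frag in fragments:
        signature = (frag.chrom, frag.start, frag.end, frag.bisulfite_strand)
        if signature in seen:
            duplicates.add(frag.name)
        else:
            seen.add(signature)
    return duplicates


fragments = [
    Fragment("r1", "chr1", 100, 250, "OT"),
    # Same coordinates and same bisulfite strand as r1: flagged as a duplicate.
    Fragment("r2", "chr1", 100, 250, "OT"),
    # Same coordinates but the other bisulfite strand: not a duplicate.
    Fragment("r3", "chr1", 100, 250, "OB"),
]

print(mark_duplicates(fragments))
```

In this sketch a coordinate-only signature would flag both `r2` and `r3`, whereas adding the bisulfite strand to the signature flags only `r2`.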

AVAILABILITY AND IMPLEMENTATION

Source code and binaries are freely available at https://github.com/huishenlab/dupsifter under the MIT license. Dupsifter is implemented in C and is supported on macOS and Linux.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/788b/10724848/c2b8c22e83e8/btad729f1.jpg
