Flexible parsing, interpretation, and editing of technical sequences with splitcode.

Author information

Sullivan Delaney K, Pachter Lior

Affiliations

UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.

Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA.

Publication information

bioRxiv. 2023 Dec 9:2023.03.20.533521. doi: 10.1101/2023.03.20.533521.

DOI:10.1101/2023.03.20.533521

PMID:36993532

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10055216/

Abstract

Next-generation sequencing libraries are constructed with numerous synthetic constructs such as sequencing adapters, barcodes, and unique molecular identifiers. Such sequences can be essential for interpreting the results of sequencing assays, and when they contain information pertinent to an experiment, they must be processed and analyzed. We present a tool called splitcode that enables flexible and efficient parsing, interpreting, and editing of sequencing reads. This versatile tool facilitates simple, reproducible preprocessing of reads from libraries constructed for a large array of single-cell and bulk sequencing assays.
