Department of Bioengineering, University of California, Berkeley, CA, 94720, United States.
Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, China.
Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae168.
Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries.
We present seqspec, a machine-readable specification for libraries produced by genomics assays that facilitates standardization of preprocessing and enables tracking and comparison of genomics assays.
The specification and the associated seqspec command-line tool are available at https://www.doi.org/10.5281/zenodo.10213865.
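To illustrate what a machine-readable description of read structure makes possible, here is a minimal sketch in Python. It is illustrative only and does not reproduce seqspec's actual schema or command-line interface. The `Region` class, the `READ_LAYOUT` list, the region names and lengths, and the `split_read` helper are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One contiguous sequence element in a read (hypothetical structure)."""
    name: str
    length: int


# Hypothetical layout: a 16-base barcode followed by a 12-base UMI.
# These names and lengths are assumptions, not taken from any real assay spec.
READ_LAYOUT = [Region("barcode", 16), Region("umi", 12)]


def split_read(sequence, layout):
    """Cut a read into named elements by walking the declared layout in order."""
    elements = {}
    position = 0
    for region in layout:
        end = position + region.length
        if end > len(sequence):
            raise ValueError(
                f"read is too short for region '{region.name}': "
                f"need {end} bases, got {len(sequence)}"
            )
        elements[region.name] = sequence[position:end]
        position = end
    return elements


parts = split_read("ACGTACGTACGTACGT" + "TTTTGGGGCCCC", READ_LAYOUT)
print(parts["barcode"], parts["umi"])
```

Because the layout is data rather than code, the same generic function could in principle serve any assay whose elements are declared this way, which is the kind of reuse that custom per-assay scripts do not offer.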