TKSM: highly modular, user-customizable, and scalable transcriptomic sequencing long-read simulator.

Affiliations

Computing Science Department, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.

Department of Computer Science, the University of British Columbia, Vancouver, BC V6T 1Z4, Canada.

Publication information

Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae051.

Abstract

MOTIVATION

Transcriptomic long-read (LR) sequencing is an increasingly cost-effective technology for probing various RNA features. Numerous tools have been developed to tackle transcriptomic sequencing tasks (e.g. isoform and gene fusion detection). However, the scarcity of gold-standard datasets hinders the benchmarking of such tools, which makes the simulation of LR sequencing an important and practical alternative. While existing LR simulators aim to imitate sequencing-machine noise and to target specific library protocols, they omit some important library preparation steps (e.g. PCR) and are difficult to adapt to new and changing library preparation techniques (e.g. single-cell LRs).

RESULTS

We present TKSM, a modular and scalable LR simulator, designed so that each RNA modification step is handled explicitly by a dedicated module. This allows the user to assemble a simulation pipeline as a combination of TKSM modules to emulate a specific sequencing design. Additionally, the input and output of all core TKSM modules follow the same simple format (Molecule Description Format), allowing the user to easily extend TKSM with new modules targeting new library preparation steps.
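The modular design described above can be illustrated with a minimal Python sketch. It is not TKSM's code, command-line interface, or the actual Molecule Description Format: the `Molecule` record, the `pcr` and `truncate` modules, and the `pipeline` helper are all hypothetical names chosen for illustration. The sketch only shows the general idea that, when every module consumes and produces the same record type, modules can be chained in any order and a new one can be added without touching the others.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable, Iterable, Iterator
import random


@dataclass(frozen=True)
class Molecule:
    """Hypothetical stand-in for one record of a shared molecule format."""
    name: str
    sequence: str
    copies: int = 1


# Every module has the same shape: a stream of molecules in, a stream out.
Module = Callable[[Iterable[Molecule]], Iterator[Molecule]]


def pcr(cycles: int, efficiency: float, seed: int = 0) -> Module:
    """Toy amplification: per cycle, each copy duplicates with probability `efficiency`."""
    def run(molecules: Iterable[Molecule]) -> Iterator[Molecule]:
        rng = random.Random(seed)
        for m in molecules:
            copies = m.copies
            for _ in range(cycles):
                copies += sum(rng.random() < efficiency for _ in range(copies))
            yield replace(m, copies=copies)
    return run


def truncate(max_len: int) -> Module:
    """Toy truncation: keep at most `max_len` bases of each molecule."""
    def run(molecules: Iterable[Molecule]) -> Iterator[Molecule]:
        for m in molecules:
            yield replace(m, sequence=m.sequence[:max_len])
    return run


def pipeline(*modules: Module) -> Module:
    """Chain modules left to right; valid because they share one record type."""
    def run(molecules: Iterable[Molecule]) -> Iterator[Molecule]:
        return reduce(lambda stream, module: module(stream), modules, iter(molecules))
    return run


def simulate_demo() -> list:
    """Assemble an illustrative two-step design and run it on two toy transcripts."""
    transcripts = [Molecule("tx1", "ACGTACGTAC"), Molecule("tx2", "GGGCCC")]
    design = pipeline(pcr(cycles=3, efficiency=1.0), truncate(max_len=8))
    return list(design(transcripts))
```

Adding a new library preparation step in this sketch means writing one more function that returns a `Module` and passing it to `pipeline`; the existing modules stay unchanged.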

AVAILABILITY AND IMPLEMENTATION

TKSM is available as open-source software at https://github.com/vpc-ccg/tksm.

