Semblans: automated assembly and processing of RNA-seq data.

Authors

Woodcock-Girard Miles D, Bretz Eric C, Robertson Holly M, Ramanauskas Karolis, Hampton-Marcell Jarrad T, Walker Joseph F

Affiliations

Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL 60607, United States.

The Sainsbury Laboratory, University of Cambridge, Cambridge, CB2 1LR, United Kingdom.

Publication information

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btaf003.

Abstract

MOTIVATION

Recent advances in parallel sequencing methods have precipitated a surge in publicly available short-read sequence data. This has encouraged the development of novel computational tools for the de novo assembly of transcriptomes from RNA-seq data. Despite the availability of these tools, performing an end-to-end transcriptome assembly remains a programmatically involved task that requires familiarity with best practices. Quality control steps, including error correction, adapter trimming, and chimera filtration, must be applied correctly, and moving data between programs often requires manual reformatting or restructuring, which can further impede throughput. Here, we introduce Semblans, a tool for streamlining the assembly process that efficiently and consistently produces high-quality transcriptome assemblies.

RESULTS

Semblans abstracts the key quality control, reconstitution, and postprocessing steps of transcriptome assembly, from raw short-read sequences to annotated coding sequences. Evaluated against previously assembled transcriptomes on the basis of assembly quality, Semblans produced higher-quality assemblies for 98 of the 101 short-read runs tested.

AVAILABILITY AND IMPLEMENTATION

Semblans is written in C++ and runs on Unix-compliant operating systems. Source code, documentation, and compiled binaries are hosted under the GNU General Public License at https://github.com/gladshire/Semblans.


