Clover: a clustering-oriented de novo assembler for Illumina sequences.

Affiliations

Department of Computer Science, National Tsing Hua University, Hsinchu, 30013, Taiwan.

Department of Computer Science and Information Engineering, Providence University, Taichung, 43301, Taiwan.

Publication information

BMC Bioinformatics. 2020 Nov 17;21(1):528. doi: 10.1186/s12859-020-03788-9.

Abstract

BACKGROUND

Next-generation sequencing technologies revolutionized genomics by producing high-throughput reads at low cost, and this progress has prompted the recent development of de novo assemblers. Multiple assembly methods based on de Bruijn graphs have been shown to be efficient for Illumina reads. However, the errors introduced by the sequencer complicate de novo assembly and affect the quality of downstream genomic research.
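
To make the background concrete, the following minimal Python sketch builds a de Bruijn graph from reads and counts k-mers, showing how a single substitution error adds rare k-mers and extra branches. It is a generic illustration, not Clover's implementation; the function names and toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def kmer_counts(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it.

    Each k-mer contributes one edge: prefix -> suffix.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Toy data (hypothetical): two error-free reads and one read with a
# single substitution (T -> A at the fourth base).
reads = ["ACGTAC", "ACGTAC", "ACGAAC"]
counts = kmer_counts(reads, k=3)
graph = build_de_bruijn(reads, k=3)

# The erroneous read adds k-mers seen only once (CGA, GAA, AAC) and a
# second outgoing branch at node "CG".
rare = sorted(kmer for kmer, n in counts.items() if n == 1)
print(rare)
print(sorted(set(graph["CG"])))
```

In this toy case one wrong base yields three singleton k-mers and a spurious branch, which is the kind of noise an error-handling step in an assembler has to resolve.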

RESULTS

In this paper, we develop a de Bruijn assembler, called Clover (clustering-oriented de novo assembler), that utilizes a novel k-mer clustering approach derived from the overlap-layout-consensus concept to deal with the sequencing errors generated by the Illumina platform. We further evaluate Clover's performance against several de Bruijn graph assemblers (ABySS, SOAPdenovo, SPAdes and Velvet), overlap-layout-consensus assemblers (Bambus2, CABOG and MSR-CA) and a string graph assembler (SGA) on three datasets (Staphylococcus aureus, Rhodobacter sphaeroides and human chromosome 14). The results show that Clover achieves superior assembly quality in terms of corrected N50 and E-size, while its run time is competitive with that of every assembler except SOAPdenovo. In addition, Clover was involved in the sequencing projects of the bacterial genomes Acinetobacter baumannii TYTH-1 and Morganella morganii KT.
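
As a rough illustration of the two quality metrics named above, the sketch below computes N50 and E-size from a list of contig lengths, assuming the common definitions: N50 is the contig length at which the longest contigs cover at least half of the reference length, and E-size is the sum of squared contig lengths divided by the reference length. The "corrected" variants are usually computed after splitting contigs at detected misassemblies; that step is not shown. The function names and lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths, genome_size=None):
    """Length L such that contigs of length >= L cover at least half the total.

    If genome_size is None, the sum of contig lengths is used instead.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # contigs never reach half of the total (or the list is empty)


def e_size(lengths, genome_size=None):
    """Expected contig length at a random base: sum(L^2) / total length."""
    total = genome_size if genome_size is not None else sum(lengths)
    if total == 0:
        return 0.0
    return sum(length * length for length in lengths) / total


# Hypothetical contig lengths (in bases) for illustration only.
contigs = [100, 80, 50, 30, 20, 20]
print(n50(contigs))     # 80: 100 + 80 = 180 covers at least half of 300
print(e_size(contigs))  # 20600 / 300
```

Passing a known genome size instead of the assembly's own total length makes the values comparable across assemblers that produce different amounts of sequence.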

CONCLUSIONS

The novel clustering-based approach of Clover, which integrates the flexibility of the overlap-layout-consensus approach with the efficiency of the de Bruijn graph method, has high potential for de novo assembly. Clover is freely available as open-source software from https://oz.nthu.edu.tw/~d9562563/src.html .

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a02/7672897/cf9bb79fee61/12859_2020_3788_Fig1_HTML.jpg