PBSIM: PacBio reads simulator--toward accurate genome assembly.

Affiliation

Information and Mathematical Science and Bioinformatics Co., Ltd., Toshima-ku, Tokyo 170-0013, Japan.

Publication information

Bioinformatics. 2013 Jan 1;29(1):119-21. doi: 10.1093/bioinformatics/bts649. Epub 2012 Nov 4.

Abstract

MOTIVATION

PacBio sequencers produce two characteristic types of reads: continuous long reads (long, with a high error rate) and circular consensus sequencing reads (short, with a low error rate). Both could be useful for de novo assembly of genomes. Currently, there is no available simulator that specifically targets the generation of PacBio libraries.

RESULTS

Our analysis of 13 PacBio datasets showed characteristic features of PacBio reads (e.g. the read length of PacBio reads follows a log-normal distribution). We have developed a read simulator, PBSIM, that captures these features using either a model-based or sampling-based method. Using PBSIM, we conducted several hybrid error correction and assembly tests for PacBio reads, suggesting that a continuous long reads coverage depth of at least 15 in combination with a circular consensus sequencing coverage depth of at least 30 achieved extensive assembly results.
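The model-based idea can be illustrated with a minimal sketch that draws read lengths from a log-normal distribution until a target coverage depth is reached. This is not PBSIM's code: the function names, the mean and standard deviation (3000 and 2300 bases), the length limits, and the 1 Mb genome size are all assumptions chosen for illustration. Only the log-normal shape and the depth of 15 come from the abstract.

```python
import math
import random


def lognormal_params(mean, sd):
    """Convert a desired mean and standard deviation into log-normal mu and sigma."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def sample_read_lengths(genome_size, depth, mean, sd,
                        min_len=100, max_len=25000, seed=0):
    """Draw read lengths until their total reaches depth * genome_size bases."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    target = depth * genome_size
    lengths, total = [], 0
    while total < target:
        length = int(round(rng.lognormvariate(mu, sigma)))
        if min_len <= length <= max_len:  # reject draws outside the allowed range
            lengths.append(length)
            total += length
    return lengths


# Hypothetical inputs: 1 Mb genome, depth 15, assumed length mean and sd.
lengths = sample_read_lengths(genome_size=1_000_000, depth=15, mean=3000, sd=2300)
print(len(lengths), sum(lengths) / 1_000_000)  # read count, achieved depth
```

A sampling-based variant would instead draw lengths (and quality profiles) from an existing set of real reads rather than from a fitted distribution.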

AVAILABILITY

PBSIM is freely available from the web under the GNU GPL v2 license (http://code.google.com/p/pbsim/).

