An efficient simulator of 454 data using configurable statistical models.

Author information

Lysholm Fredrik, Andersson Björn, Persson Bengt

Affiliation

IFM Bioinformatics and SeRC (Swedish e-Science Research Centre), Linköping University, S-581 83 Linköping, Sweden.

Publication information

BMC Res Notes. 2011 Oct 26;4:449. doi: 10.1186/1756-0500-4-449.

Abstract

BACKGROUND

Roche 454 is one of the major second-generation sequencing platforms. The particular characteristics of 454 sequence data pose new challenges for bioinformatic analyses, e.g. for assembly and alignment search algorithms. Simulating these data is therefore useful for assessing how bioinformatic applications and algorithms handle 454 data.

FINDINGS

We developed a new application, 454sim, that simulates 454 data with high speed and accuracy. The program supports multithreading and is available as C++ source code or pre-compiled binaries. 454sim simulates sequence reads using a set of statistical models for each chemistry: it simulates recorded peak intensities and peak quality deterioration, and it calculates quality values. All three generations of the Roche 454 chemistry ('GS20', 'GS FLX' and 'Titanium') are supported, and their models are defined in external text files for easy access and tweaking.
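
As a rough illustration of what simulating peak intensities and their deterioration can involve, the sketch below converts a DNA sequence into an ideal flowgram (one homopolymer length per nucleotide flow), perturbs each value with Gaussian noise whose spread grows with homopolymer length and with flow index, and then calls bases by rounding. It is not the 454sim implementation: the flow order `TACG`, the Gaussian noise model, every parameter value and all function names are assumptions chosen for illustration, and quality-value calculation is left out.

```python
import random

FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order (illustrative)


def ideal_flowgram(seq, flow_order=FLOW_ORDER):
    """Return the noise-free homopolymer length observed at each flow."""
    unknown = set(seq) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    flows = []
    pos = 0
    flow_idx = 0
    while pos < len(seq):
        base = flow_order[flow_idx % len(flow_order)]
        run = 0
        # Consume every consecutive copy of the base offered in this flow.
        while pos < len(seq) and seq[pos] == base:
            run += 1
            pos += 1
        flows.append(run)
        flow_idx += 1
    return flows


def noisy_flowgram(ideal, rng, base_sd=0.05, sd_per_base=0.03,
                   decay_per_flow=0.001):
    """Add Gaussian noise to each flow value.

    Hypothetical model: the standard deviation grows with homopolymer
    length and with the flow index, mimicking signal deterioration.
    """
    noisy = []
    for idx, n in enumerate(ideal):
        sd = (base_sd + sd_per_base * n) * (1.0 + decay_per_flow * idx)
        noisy.append(max(0.0, rng.gauss(n, sd)))  # intensities stay >= 0
    return noisy


def call_bases(intensities, flow_order=FLOW_ORDER):
    """Call bases by rounding each intensity to the nearest integer."""
    called = []
    for idx, value in enumerate(intensities):
        base = flow_order[idx % len(flow_order)]
        called.append(base * int(value + 0.5))
    return "".join(called)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed for reproducibility
    template = "TTACGGGATTTTTC"
    read = call_bases(noisy_flowgram(ideal_flowgram(template), rng))
    print(template)
    print(read)
```

Because the noise spread increases with homopolymer length, long runs such as `TTTTT` are the most likely to be miscalled as one base too long or too short in this sketch. Keeping the noise parameters as plain function arguments loosely mirrors the idea of adjustable models, although 454sim itself reads its models from external text files.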

CONCLUSIONS

We present a new platform-independent application named 454sim. It is generally 200 times faster than previous programs and allows simple adjustment of the statistical models. These improvements make it possible to carry out more complex and rigorous algorithm evaluations on a reasonable time scale.
