FAAST: Flow-space Assisted Alignment Search Tool.

Affiliation

IFM Bioinformatics and SeRC (Swedish e-Science Research Centre), Linköping University, S-581 83 Linköping, Sweden.

Publication information

BMC Bioinformatics. 2011 Jul 19;12:293. doi: 10.1186/1471-2105-12-293.

DOI: 10.1186/1471-2105-12-293
PMID: 21771335
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3228549/

Abstract

BACKGROUND

High-throughput pyrosequencing (454 sequencing) is the major sequencing platform for producing long-read, high-throughput data. While most other sequencing techniques produce read errors that mainly resemble substitutions, pyrosequencing produces errors that mainly resemble gaps. Most conventional alignment programs detect these errors less efficiently, which may lead to inaccurate alignments.

RESULTS

We suggest a novel algorithm for calculating the optimal local alignment which utilises flowpeak information in order to improve alignment accuracy. Flowpeak information can be retained from a 454 sequencing run through interpretation of the binary SFF-file format. This novel algorithm has been implemented in a program named FAAST (Flow-space Assisted Alignment Search Tool).
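
To make the idea of flow-space concrete, the following is a minimal sketch, not FAAST code. It assumes a cyclic nucleotide flow order of TACG and idealised, noise-free flow values; the function names `seq_to_flowgram` and `flowgram_to_seq` are hypothetical. It shows that each flow reports a homopolymer length, so a slightly misread flow value turns into an inserted or deleted base (a gap) rather than a substitution.

```python
def seq_to_flowgram(seq, flow_order="TACG"):
    """Return the ideal homopolymer length seen in each flow.

    The flow order is an assumption for illustration; flow k tests the
    base flow_order[k % len(flow_order)].
    """
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base outside the flow order")
    flows = []
    i = 0  # position in the sequence
    k = 0  # index of the current flow
    while i < len(seq):
        base = flow_order[k % len(flow_order)]
        run = 0
        # Consume the whole homopolymer run that matches this flow.
        while i < len(seq) and seq[i] == base:
            run += 1
            i += 1
        flows.append(run)
        k += 1
    return flows


def flowgram_to_seq(flows, flow_order="TACG"):
    """Call bases by rounding each (possibly noisy) flow value."""
    called = []
    for k, value in enumerate(flows):
        length = int(value + 0.5)  # round to the nearest homopolymer length
        called.append(flow_order[k % len(flow_order)] * length)
    return "".join(called)


ideal = seq_to_flowgram("TTACCG")       # [2, 1, 2, 1]
noisy = [1.4, 1.0, 2.0, 1.0]            # first flow under-called
print(ideal, flowgram_to_seq(noisy))    # the called read has lost one T
```

In this toy case the under-called first flow yields `TACCG` instead of `TTACCG`, which an aligner sees as a one-base deletion inside a homopolymer.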

CONCLUSIONS

We present and discuss the results of simulations that show that FAAST, through the use of the novel algorithm, can gain several percentage points of accuracy compared to Smith-Waterman-Gotoh alignments, depending on the 454 data quality. Furthermore, through an efficient multi-thread aware implementation, FAAST is able to perform these high quality alignments at high speed. The tool is available at http://www.ifm.liu.se/bioinfo/
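
For reference, the baseline named above, Smith-Waterman local alignment with Gotoh's affine gap penalties, can be sketched as follows. This is a minimal score-only illustration and not the FAAST algorithm, which the abstract does not specify; the function name `sw_gotoh_score` and the scoring values (match 2, mismatch -1, gap open -3, gap extend -1) are assumptions chosen for the example.

```python
def sw_gotoh_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gaps.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score ending at (i, j); E/F: best score ending in a gap.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]  # gap consuming b[j-1]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]  # gap consuming a[i-1]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets a local alignment restart anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# A read that lost one T from a homopolymer run still aligns end to end,
# but pays a generic gap-open penalty for what is a typical 454 error.
print(sw_gotoh_score("ACGTTTACG", "ACGTTACG"))
```

The gap penalty here is the same everywhere in the sequence, which is the limitation the abstract points to: a conventional scheme has no way to treat a homopolymer-length miscall as more likely than an arbitrary indel.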
