SeedsGraph: an efficient assembler for next-generation sequencing data.

Authors

Wang Chunyu, Guo Maozu, Liu Xiaoyan, Liu Yang, Zou Quan

Publication information

BMC Med Genomics. 2015;8 Suppl 2(Suppl 2):S13. doi: 10.1186/1755-8794-8-S2-S13. Epub 2015 May 29.

DOI:10.1186/1755-8794-8-S2-S13
PMID:26044652
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4460749/
Abstract

DNA sequencing technology has evolved rapidly and now produces short reads in fast-growing numbers. This has led to a resurgence of research in whole-genome shotgun assembly algorithms. Our assembly algorithm begins by clustering the short reads in a cloud computing framework; the clustering groups fragments according to the similarity of their original consensus long sequences. We condense each group of reads into a chain of seeds, where a seed is a kind of substring with reads aligned to it, and then build a graph from these chains. Finally, we analyze the graph to find Euler paths, assemble the reads along those paths into contigs, and lay out the contigs into scaffolds using mate-pair information. The results show that the algorithm is efficient and feasible for large read sets such as those produced by next-generation sequencing.
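The Euler-path step can be illustrated with a small sketch. This is not the SeedsGraph implementation: the abstract does not specify how seeds or the seed graph are constructed, so the example below substitutes a plain de Bruijn-style k-mer graph as a stand-in and applies Hierholzer's algorithm to it. The function names (`build_graph`, `euler_path`, `assemble`), the value of `k`, and the toy reads are all assumptions made for illustration, and the sketch assumes error-free reads whose graph actually contains an Euler path.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Stand-in graph (assumption): each distinct k-mer is an edge
    from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def euler_path(graph):
    """Hierholzer's algorithm. Assumes the graph is non-empty and
    has an Euler path; no validation is performed."""
    in_deg = defaultdict(int)
    for successors in graph.values():
        for v in successors:
            in_deg[v] += 1
    # Start at the node with one more outgoing than incoming edge,
    # or at an arbitrary node if the graph has an Euler circuit.
    start = next(
        (u for u in graph if len(graph[u]) - in_deg[u] == 1),
        next(iter(graph)),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: emit node
    return path[::-1]


def assemble(reads, k):
    """Spell the contig along the Euler path: first node in full,
    then the last character of every following node."""
    path = euler_path(build_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Hypothetical toy reads, overlapping by four bases each.
reads = ["ATGGCG", "GGCGTG", "CGTGCA", "TGCAAT"]
contig = assemble(reads, k=4)
print(contig)  # ATGGCGTGCAAT
```

On sequences with repeats, a graph may admit several Euler paths, so the spelled contig is not unique in general; the clustering and mate-pair scaffolding steps described in the abstract are outside this sketch.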

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c823/4460749/2b17cbdbc35f/1755-8794-8-S2-S13-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c823/4460749/2f3ea6925b77/1755-8794-8-S2-S13-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c823/4460749/ad3c2fc184b9/1755-8794-8-S2-S13-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c823/4460749/8f9daa76114e/1755-8794-8-S2-S13-3.jpg


Cited by

1
Connecting the dots in translational bioinformatics: TBC 2014 collection.
BMC Med Genomics. 2015;8 Suppl 2(Suppl 2):I1. doi: 10.1186/1755-8794-8-S2-I1. Epub 2015 May 29.
