Assembly complexity of prokaryotic genomes using short reads.

Affiliation

Department of Computer Science, Institute for Advanced Computer Studies, University of Maryland, College Park, MD, USA.

Publication information

BMC Bioinformatics. 2010 Jan 12;11:21. doi: 10.1186/1471-2105-11-21.

DOI: 10.1186/1471-2105-11-21
PMID: 20064276
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2821320/

Abstract

BACKGROUND

De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes.
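
As a rough illustration of the idea (a minimal sketch, not the authors' implementation), the snippet below builds a de Bruijn graph from a single error-free, linear, single-stranded sequence and lists the nodes where the graph branches. The function names `de_bruijn_graph` and `branching_nodes` and the toy sequence are hypothetical, chosen only for this example. Nodes are (k-1)-mers and each k-mer contributes one edge; a node with more than one successor marks a repeat that reads of that effective length cannot resolve.

```python
from collections import defaultdict


def de_bruijn_graph(sequence, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in `sequence`.

    Assumption: one linear, error-free strand; real assemblers also handle
    reverse complements, circular chromosomes, and sequencing errors.
    """
    graph = defaultdict(set)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        graph[kmer[:-1]].add(kmer[1:])  # edge: prefix node -> suffix node
    return graph


def branching_nodes(graph):
    """Return, in sorted order, the nodes with more than one distinct successor."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Toy sequence (hypothetical): "ATGC" occurs twice, followed by different bases.
toy_genome = "ATGCGATGCT"

for k in (4, 5, 6):
    print(k, branching_nodes(de_bruijn_graph(toy_genome, k)))
```

In this toy case the repeated 4-mer `ATGC` still causes a branch at k = 5 but not at k = 6, which mirrors the general point that longer reads leave fewer unresolved repeats in the graph.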

RESULTS

We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper bound on the performance of genome assemblers for de novo reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages).

CONCLUSIONS

Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84e1/2821320/71903c0d205b/1471-2105-11-21-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84e1/2821320/372a4bd9d62e/1471-2105-11-21-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84e1/2821320/54dc880c643e/1471-2105-11-21-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84e1/2821320/e03ee3405234/1471-2105-11-21-4.jpg
