B-assembler: a circular bacterial genome assembler.

Affiliations

Informatics Institute, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA.

Department of Genetics, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA.

Publication information

BMC Genomics. 2022 May 11;23(Suppl 4):361. doi: 10.1186/s12864-022-08577-7.

DOI: 10.1186/s12864-022-08577-7
PMID: 35546658
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9092672/

Abstract

BACKGROUND

Accurate de novo assembly of bacterial genomes is fundamental to understanding the evolution and pathogenesis of new bacterial species. The advent and popularity of third-generation sequencing (TGS) enable the assembly of bacterial genomes at unprecedented speed. However, most current TGS assemblers were designed for human or other species that do not have circular genomes. In addition, repetitive DNA fragments in many bacterial genomes, together with the high error rate of long-read sequencing data, make accurate assembly challenging even though these genomes are relatively small. An optimized method is therefore urgently needed to address these issues.

RESULTS

We developed B-assembler, which can assemble bacterial genomes from long reads alone or from a combination of short and long reads. B-assembler takes advantage of the structure-resolving power of long reads and, where available, the accuracy of short reads. It first selects and corrects the ultra-long reads to obtain an initial contig. It then collects the reads that overlap the ends of the initial contig. This two-round assembly procedure, together with optimized error correction, enables a high-confidence, circularized genome assembly. Benchmarks on both synthetic and real sequencing data from several bacterial species show that both the long-read-only and hybrid-read modes accurately assemble circular bacterial genomes free of structural errors, with fewer small errors than other assemblers.
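
The abstract gives no implementation details, so the following is only a toy sketch of two ideas it mentions: picking the longest reads first, and closing a linear contig into a circle when its two ends overlap. It is not B-assembler's code. The function names, the coverage budget, and the exact-match overlap test are all assumptions made for illustration; real long reads contain errors, so a practical tool would detect end overlaps by alignment rather than exact string comparison.

```python
from typing import Dict, List, Tuple


def select_longest_reads(reads: Dict[str, str], genome_size: int,
                         target_coverage: float) -> List[str]:
    """Return IDs of the longest reads, taken in descending length order,
    until their summed length reaches genome_size * target_coverage.
    (Hypothetical selection rule, not taken from the paper.)"""
    budget = genome_size * target_coverage
    chosen: List[str] = []
    total = 0
    for read_id, seq in sorted(reads.items(),
                               key=lambda kv: len(kv[1]), reverse=True):
        if total >= budget:
            break
        chosen.append(read_id)
        total += len(seq)
    return chosen


def end_overlap(contig: str, min_overlap: int) -> int:
    """Length of the longest suffix of `contig` that exactly equals its
    prefix, considering only lengths from min_overlap up to half the
    contig. Returns 0 if no such overlap exists."""
    for k in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[-k:] == contig[:k]:
            return k
    return 0


def circularize(contig: str, min_overlap: int) -> Tuple[str, bool]:
    """Trim the redundant end of a contig whose ends overlap.
    Returns (sequence, is_circular)."""
    k = end_overlap(contig, min_overlap)
    if k == 0:
        return contig, False  # ends do not meet; leave the contig linear
    return contig[:-k], True  # drop the duplicated suffix


if __name__ == "__main__":
    genome = "ACGTTGCAAGGCTA"
    # A linear contig that runs once around the circle and 5 bases further.
    linear = genome + genome[:5]
    sequence, closed = circularize(linear, min_overlap=4)
    print(sequence, closed)
```

The suffix/prefix test reflects why overlapping ends matter for a circular genome: a linear assembly that has gone all the way around the chromosome repeats its start at its end, and trimming that repeat yields one copy of the circle.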

CONCLUSIONS

B-assembler provides a more accurate solution for assembling circular bacterial genomes, which will facilitate downstream bacterial genome analysis.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efbb/9092672/b2ba8045cc07/12864_2022_8577_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efbb/9092672/af440a77d9fc/12864_2022_8577_Fig2_HTML.jpg
