Comparisons of genome assembly tools for characterization of Mycobacterium tuberculosis genomes using hybrid sequencing technologies.

Affiliations

Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand.

Research and Diagnostic Center for Emerging Infectious Diseases (RCEID), Khon Kaen University, Khon Kaen, Thailand.

Publication information

PeerJ. 2024 Aug 29;12:e17964. doi: 10.7717/peerj.17964. eCollection 2024.

DOI: 10.7717/peerj.17964
PMID: 39221271
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11366230/
Abstract

BACKGROUND

Next-generation sequencing of Mycobacterium tuberculosis, the infectious agent causing tuberculosis, is improving the understanding of the genomic diversity of circulating lineages and strain types, and informing knowledge of drug resistance mutations. An increasingly popular approach to characterizing M. tuberculosis genomes (size: 4.4 Mbp) and variants (e.g., single nucleotide polymorphisms (SNPs)) involves the assembly of sequence data.

METHODS

We compared the performance of three genome assembly tools (Unicycler, RagOut, and RagTag) on sequence data from nine drug-resistant M. tuberculosis isolates (multi-drug resistant (MDR) = 1; pre-extensively drug resistant (pre-XDR) = 8) generated using the Illumina HiSeq, Oxford Nanopore Technologies (ONT) PromethION, and PacBio platforms.
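As a rough illustration of what invoking two of the compared tools can look like, the sketch below only builds command lines for a Unicycler hybrid assembly and a RagTag reference-guided scaffolding step. It does not run them. The file names, output directories, and thread count are hypothetical placeholders, and the abstract does not state which parameters the authors used. RagOut is left out because it is driven by a recipe file whose contents are not described here.

```python
from typing import List


def unicycler_hybrid_cmd(short_r1: str, short_r2: str, long_reads: str,
                         out_dir: str, threads: int = 8) -> List[str]:
    """Build (but do not run) a Unicycler hybrid-assembly command.

    -1/-2 are the paired short-read files, -l the long reads,
    -o the output directory, -t the number of threads.
    """
    return [
        "unicycler",
        "-1", short_r1,
        "-2", short_r2,
        "-l", long_reads,
        "-o", out_dir,
        "-t", str(threads),
    ]


def ragtag_scaffold_cmd(reference_fasta: str, draft_fasta: str,
                        out_dir: str) -> List[str]:
    """Build (but do not run) a RagTag scaffolding command.

    RagTag orders and orients the draft contigs against the reference.
    """
    return ["ragtag.py", "scaffold", reference_fasta, draft_fasta, "-o", out_dir]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(unicycler_hybrid_cmd(
        "isolate1_R1.fastq.gz", "isolate1_R2.fastq.gz",
        "isolate1_ont.fastq.gz", "unicycler_out")))
    print(" ".join(ragtag_scaffold_cmd(
        "H37Rv.fasta", "unicycler_out/assembly.fasta", "ragtag_out")))
```

Keeping the commands as argument lists makes them easy to pass to `subprocess.run` later, once real paths and tool versions have been chosen.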

RESULTS

Our investigation found that Unicycler-based assemblies had significantly higher genome completeness (98.7%; p value = 0.01) than those from the other assembly tools (RagOut = 98.6%, and RagTag = 98.6%). Across isolates and sequencers, RagOut assemblies were significantly larger (p value < 0.001) (4,418,574 ± 8,824 bp) than Unicycler and RagTag assemblies (Unicycler = 4,377,642 ± 55,257 bp, and RagTag = 4,380,711 ± 51,164 bp). RagOut-based assemblies had the fewest contigs (32) and the largest genome size (4,418,574 bp; cf. H37Rv reference size 4,411,532 bp) and were therefore chosen for downstream analysis. Pan-genome analysis of Illumina and PacBio hybrid assemblies revealed the greatest number of detected genes (4,639 genes; the H37Rv reference contains 3,976 genes), while Illumina and ONT hybrid assemblies produced the highest number of SNPs. The number of genes from hybrid assemblies with ONT and PacBio long reads (mean: 4,620 genes) was greater than from short-read assembly alone (4,478 genes). All nine RagOut hybrid genome assemblies detected known mutations in genes associated with MDR-TB and pre-XDR-TB.
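To put the reported assembly sizes in context, the short sketch below compares each tool's mean assembly length with the H37Rv reference length quoted above. The numbers come from the abstract. The function and variable names are illustrative only, and the calculation is a simple ratio, not the completeness metric the authors report.

```python
# Mean assembly sizes (bp) and reference length (bp) as quoted in the abstract.
H37RV_REFERENCE_BP = 4_411_532
MEAN_ASSEMBLY_BP = {
    "Unicycler": 4_377_642,
    "RagOut": 4_418_574,
    "RagTag": 4_380_711,
}


def relative_to_reference(assembly_bp: int,
                          reference_bp: int = H37RV_REFERENCE_BP):
    """Return (difference in bp, assembly length as % of reference length)."""
    difference = assembly_bp - reference_bp
    percent = 100.0 * assembly_bp / reference_bp
    return difference, percent


if __name__ == "__main__":
    for tool, size in MEAN_ASSEMBLY_BP.items():
        diff, pct = relative_to_reference(size)
        print(f"{tool}: {diff:+,} bp vs. H37Rv ({pct:.2f}% of reference length)")
```

On these figures the mean RagOut assembly is about 7 kbp longer than the reference, while the mean Unicycler and RagTag assemblies are roughly 31 to 34 kbp shorter.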

CONCLUSIONS

Unicycler software performed the best in terms of achieving contiguous genomes, whereas RagOut improved the quality of Unicycler's genome assemblies by providing a longer genome size. Overall, our approach has demonstrated that short-read and long-read hybrid assembly can provide a more complete genome assembly than short-read assembly alone by detecting pan-genomes and more genes, including IS, and SNPs.
