Comparison of De Novo Assembly Strategies for Bacterial Genomes.

Affiliations

Key Laboratory of Animal Diseases and Human Health of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China.

College of Veterinary Medicine, Sichuan Agricultural University, Chengdu 611130, China.

Publication information

Int J Mol Sci. 2021 Jul 17;22(14):7668. doi: 10.3390/ijms22147668.

Abstract

(1) Background: Short-read sequencing allows rapid and accurate analysis of the whole bacterial genome but usually does not yield a complete genome assembly. Long-read sequencing greatly assists with the resolution of complex bacterial genomes, particularly when combined with short-read Illumina data. However, it is not clear how different assembly strategies affect genomic accuracy, completeness, and protein prediction. (2) Methods: We compared different assembly strategies for Haemophilus parasuis, which causes Glässer's disease in swine, characterized by fibrinous polyserositis and arthritis, using Illumina sequencing and long reads from either the Oxford Nanopore Technologies (ONT) or the Pacific Biosciences (PacBio) SMRT platform. (3) Results: Assembly with either PacBio or ONT reads, followed by polishing with Illumina reads, facilitated high-quality genome reconstruction and was superior to the long-read-only assembly and hybrid-assembly strategies in terms of accuracy and completeness. Correcting an ONT-only assembly with Homopolish performed equally well and had the advantage of not requiring additional Illumina sequencing. Furthermore, aligning transcripts to the assembled genomes and their predicted CDSs showed that the sequencing errors of the ONT assembly were mainly indels generated when homopolymer regions were sequenced, which critically affected protein prediction. Polishing can repair these indels and correct other errors. (4) Conclusions: Bacterial genomes can be assembled directly from long-read sequencing data. To maximize assembly accuracy, it is essential to polish the assembly with homologous sequences from related genomes or with short-read sequencing data.
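
The effect of homopolymer indels on protein prediction can be illustrated with a toy example. The sketch below is not taken from the paper: the sequence `REFERENCE`, the helper names `translate` and `homopolymer_runs`, and the 4-base run threshold are all assumptions made for illustration. It shows how losing a single base inside a homopolymer run shifts the reading frame and truncates the predicted protein.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order (amino acid assignments are the
# same in the bacterial table 11; only start codons differ).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def homopolymer_runs(dna, min_len=4):
    """Return (start, base, length) for each run of >= min_len identical bases."""
    pattern = r"([ACGT])\1{%d,}" % (min_len - 1)
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in re.finditer(pattern, dna)]


# Hypothetical toy CDS containing a 7-base poly-A run (not a real gene).
REFERENCE = "ATGGCAAAAAAACTGATCGGTTAA"

# Simulate a homopolymer deletion of the kind described for ONT-only
# assemblies: drop one A from the run.
start, base, length = homopolymer_runs(REFERENCE)[0]
ONT_LIKE = REFERENCE[:start] + REFERENCE[start + 1:]

print(homopolymer_runs(REFERENCE))  # location of the poly-A run
print(translate(REFERENCE))         # full-length toy protein
print(translate(ONT_LIKE))          # frameshifted, prematurely terminated
```

In this toy case the one-base deletion turns a seven-residue product into a four-residue one, which is the kind of truncated CDS that polishing with short reads or related genomes is meant to prevent.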

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b2e/8306402/3fdc251d5019/ijms-22-07668-g001.jpg
