Comparing Long-Read Assemblers to Explore the Potential of a Sustainable Low-Cost, Low-Infrastructure Approach to Sequence Antimicrobial Resistant Bacteria With Oxford Nanopore Sequencing.

Author information

Boostrom Ian, Portal Edward A R, Spiller Owen B, Walsh Timothy R, Sands Kirsty

Affiliations

Division of Infection and Immunity, Department of Medical Microbiology, Cardiff University, Cardiff, United Kingdom.

Department of Zoology, Ineos Oxford Institute for Antimicrobial Research, University of Oxford, Oxford, United Kingdom.

Publication information

Front Microbiol. 2022 Mar 3;13:796465. doi: 10.3389/fmicb.2022.796465. eCollection 2022.

Abstract

Long-read sequencing (LRS) can resolve repetitive regions, a limitation of short-read (SR) data. Reductions in cost and instrument size have led to a steady increase in the use of LRS across diagnostics and research. Here, we re-basecalled FAST5 data sequenced between 2018 and 2021 and analyzed the data in relation to gDNA across a large dataset (n = 200) spanning a wide range of GC content (25-67%). We examined whether re-basecalled data would improve the hybrid assembly and, for a smaller cohort, compared long-read (LR) assemblies in the context of antimicrobial resistance (AMR) genes (ARG) and mobile genetic elements. We included a cost analysis when comparing SR and LR instruments. We compared the R9 and R10 chemistries and reported not only a larger yield but also increased read quality with R9 flow cells. There were often discrepancies in ARG presence/absence and/or variant detection in LR assemblies. Flye-based assemblies were generally efficient at detecting the presence of ARG on both the chromosome and plasmids. Raven ran more quickly but recovered small plasmids inconsistently, notably a ∼15-kb Col-like plasmid harboring blaCTX-M-15. Canu assemblies were the most fragmented, with genome sizes larger than expected. LR assemblies failed to consistently detect the multiple copies of the same ARG identified by the Unicycler reference. Even with improvements to ONT chemistry and basecalling, long-read assemblies can lead to misinterpretation of data. If LR data are currently being relied upon, it is necessary to perform multiple assemblies, although this is resource (computing) intensive and not yet readily available/usable.
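The abstract's central comparison, whether each long-read assembly reports the same ARG and the same copy numbers as the Unicycler reference, can be illustrated with a small sketch. This is not the authors' pipeline: the function name `arg_discrepancies`, the assembler labels, and the gene calls below are all hypothetical placeholders, and the sketch assumes ARG calls have already been produced for each assembly by some annotation tool.

```python
from collections import Counter


def arg_discrepancies(reference, assemblies):
    """Compare ARG copy numbers in each assembly against a reference.

    reference:  list of ARG names called in the reference (hybrid) assembly,
                repeated once per copy for multi-copy genes.
    assemblies: dict mapping an assembler label to its list of ARG calls.
    Returns a dict: label -> {gene: (reference_copies, assembly_copies)}
    containing only the genes whose copy number differs.
    """
    ref_counts = Counter(reference)
    report = {}
    for label, genes in assemblies.items():
        counts = Counter(genes)
        diffs = {}
        # Check every gene seen in either the reference or this assembly.
        for gene in sorted(set(ref_counts) | set(counts)):
            if ref_counts[gene] != counts[gene]:
                diffs[gene] = (ref_counts[gene], counts[gene])
        report[label] = diffs
    return report


# Hypothetical gene calls, invented purely for illustration.
reference = ["geneA", "geneA", "geneB", "geneC"]
assemblies = {
    "assembler_1": ["geneA", "geneA", "geneB", "geneC"],
    "assembler_2": ["geneA", "geneB"],  # lost one copy and one gene
    "assembler_3": ["geneA", "geneA", "geneB", "geneC", "geneD"],
}

report = arg_discrepancies(reference, assemblies)
for label, diffs in report.items():
    print(label, diffs or "matches reference")
```

Counting copies rather than recording simple presence/absence matters here, because an assembly that collapses two copies of a gene into one would otherwise look identical to the reference.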

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/901a/8928191/68cd764461a0/fmicb-13-796465-g001.jpg
