Assembling highly repetitive Xanthomonas TALomes using Oxford Nanopore sequencing.

Affiliations

Institute of Computer Science, Martin Luther University Halle-Wittenberg, 06120, Halle, Germany.

Department of Plant Biotechnology, Leibniz Universität Hannover, 30419, Hannover, Germany.

Publication information

BMC Genomics. 2023 Mar 27;24(1):151. doi: 10.1186/s12864-023-09228-1.

Abstract

BACKGROUND

Most plant-pathogenic Xanthomonas bacteria harbor transcription activator-like effector (TALE) genes, which function as transcriptional activators of host plant genes and support infection. The entire repertoire of up to 29 TALE genes of a Xanthomonas strain is also referred to as its TALome. The DNA-binding domain of TALEs is composed of highly conserved repeats, and TALE genes often occur in gene clusters, which precludes the assembly of TALE-carrying Xanthomonas genomes with standard sequencing approaches.

RESULTS

Here, we report the successful assembly of the 5 Mbp genomes of five Xanthomonas strains from Oxford Nanopore Technologies (ONT) sequencing data. For one of these strains, Xanthomonas oryzae pv. oryzae (Xoo) PXO35, we illustrate why Illumina short reads and longer PacBio reads are insufficient to fully resolve the genome. While ONT reads are well suited to yielding highly contiguous genomes, they suffer from a specific error profile within homopolymers. To still obtain complete and correct TALomes from ONT assemblies, we present a computational correction pipeline specifically tailored to TALE genes, which yields accuracy at least comparable to that of Illumina-based polishing. We further systematically assess the ONT-based pipeline for its multiplexing capacity and find that, combined with computational correction, the complete TALome of Xoo PXO35 could have been reconstructed from fewer than 20,000 ONT reads.
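To make the homopolymer issue concrete, the following minimal Python sketch locates homopolymer runs (stretches of one repeated base) in a DNA sequence, which is where ONT errors are described as concentrating. It is an illustration only and not the correction pipeline presented in the paper; the function name `homopolymer_runs`, the `min_len` threshold of 4, and the example sequence are all assumptions made for this sketch.

```python
import re
from typing import List, Tuple


def homopolymer_runs(seq: str, min_len: int = 4) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for each run of a single base
    that is at least min_len long. min_len is an illustrative
    threshold, not a value taken from the paper."""
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical toy sequence, not real TALE gene data.
example = "ACGGGGGTTACCCCAAAT"
for start, base, length in homopolymer_runs(example):
    print(f"{base} x {length} at position {start}")
```

Positions flagged this way are candidates for closer inspection after assembly, since an insertion or deletion inside a run changes the reading frame of the affected gene.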

CONCLUSIONS

Our results indicate that multiplexed ONT sequencing combined with computational correction of TALE genes constitutes a highly capable tool for characterizing the TALomes of large collections of Xanthomonas strains in the future.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b067/10045945/79131025e165/12864_2023_9228_Fig1_HTML.jpg