Use of low-coverage, large-insert, short-read data for rapid and accurate generation of enhanced-quality draft Pseudomonas genome sequences.

Affiliation

Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada.

Publication information

PLoS One. 2011;6(11):e27199. doi: 10.1371/journal.pone.0027199. Epub 2011 Nov 2.

Abstract

Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes, which frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an "enhanced-quality draft" genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2-5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6cfe/3206934/b084760079b9/pone.0027199.g001.jpg
