Ultra-deep, long-read nanopore sequencing of mock microbial community standards.

Author information

Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Edgbaston, B15 2TT, UK.

Zymo Research Corporation, 17062 Murphy Ave., Irvine, CA 92614, USA.

Publication information

Gigascience. 2019 May 1;8(5). doi: 10.1093/gigascience/giz043.

Abstract

BACKGROUND

Long sequencing reads are information-rich: aiding de novo assembly and reference mapping, and consequently have great potential for the study of microbial communities. However, the best approaches for analysis of long-read metagenomic data are unknown. Additionally, rigorous evaluation of bioinformatics tools is hindered by a lack of long-read data from validated samples with known composition.

FINDINGS

We sequenced 2 commercially available mock communities containing 10 microbial species (ZymoBIOMICS Microbial Community Standards) with Oxford Nanopore GridION and PromethION. Both communities and the 10 individual species isolates were also sequenced with Illumina technology. We generated 14 and 16 gigabase pairs from 2 GridION flowcells and 150 and 153 gigabase pairs from 2 PromethION flowcells for the evenly distributed and log-distributed communities, respectively. Read length N50 ranged between 5.3 and 5.4 kilobase pairs over the 4 sequencing runs. Basecalls and corresponding signal data are made available (4.2 TB in total). Alignment to Illumina-sequenced isolates demonstrated the expected microbial species at anticipated abundances, with the limit of detection for the lowest abundance species below 50 cells (GridION). De novo assembly of metagenomes recovered long contiguous sequences without the need for pre-processing techniques such as binning.

CONCLUSIONS

We present ultra-deep, long-read nanopore datasets from a well-defined mock community. These datasets will be useful for those developing bioinformatics methods for long-read metagenomics and for the validation and comparison of current laboratory and software pipelines.
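The findings above report a read length N50 for each sequencing run. As a reminder of what that statistic means, here is a minimal sketch of one common way to compute it: sort the read lengths from longest to shortest and return the length at which the running total first reaches half of all sequenced bases. The function name `read_length_n50` and the toy lengths are illustrative assumptions, not code or data from the study.

```python
def read_length_n50(lengths):
    """Return the N50 of a collection of read lengths (in bases).

    N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one read length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy read lengths for illustration only (not data from the study).
toy_lengths = [2000, 3000, 4000, 5000, 6000]
print(read_length_n50(toy_lengths))  # prints 5000
```

In the toy example the five reads total 20,000 bases; the two longest reads (6,000 and 5,000) already cover 11,000 bases, so the N50 is 5,000.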

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2716/6520541/9969a8ac6583/giz043fig1.jpg
