Characterization and simulation of metagenomic nanopore sequencing data with Meta-NanoSim.

Author information

Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 4S6, Canada.

Bioinformatics Graduate Program, University of British Columbia, Genome Sciences Centre, BCCA 100-570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada.

Publication information

Gigascience. 2023 Mar 20;12. doi: 10.1093/gigascience/giad013.

Abstract

BACKGROUND

Nanopore sequencing is crucial to metagenomic studies as its kilobase-long reads can contribute to resolving genomic structural differences among microbes. However, sequencing platform-specific challenges, including high base-call error rate, nonuniform read lengths, and the presence of chimeric artifacts, necessitate specifically designed analytical algorithms. The use of simulated datasets with characteristics that are true to the sequencing platform under evaluation is a cost-effective way to assess the performance of bioinformatics tools with the ground truth in a controlled environment.

RESULTS

Here, we present Meta-NanoSim, a fast and versatile utility that characterizes and simulates the unique properties of nanopore metagenomic reads. It improves upon state-of-the-art methods on microbial abundance estimation through a base-level quantification algorithm. Meta-NanoSim can simulate complex microbial communities composed of both linear and circular genomes and can stream reference genomes from online servers directly. Simulated datasets showed high congruence with experimental data in terms of read length, error profiles, and abundance levels. We demonstrate that Meta-NanoSim simulated data can facilitate the development of metagenomic algorithms and guide experimental design through a metagenome assembly benchmarking task.
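The abstract does not spell out how the base-level quantification works, so the following is only a minimal sketch of the general idea, assuming that "base-level" means weighting each genome by the number of aligned bases rather than by the number of reads. The function names and the toy alignment tuples are hypothetical and are not taken from the NanoSim code base.

```python
from collections import defaultdict


def read_level_abundance(alignments):
    """Relative abundance from read counts (one vote per read)."""
    counts = defaultdict(int)
    for genome, _aligned_bases in alignments:
        counts[genome] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {genome: n / total for genome, n in counts.items()}


def base_level_abundance(alignments):
    """Relative abundance from aligned bases (one vote per base).

    Assumption: each alignment is a (genome_name, aligned_bases) pair.
    """
    bases = defaultdict(int)
    for genome, aligned_bases in alignments:
        bases[genome] += aligned_bases
    total = sum(bases.values())
    if total == 0:
        return {}
    return {genome: n / total for genome, n in bases.items()}


# Hypothetical toy input: three short reads from genome_A, one long read from genome_B.
toy = [
    ("genome_A", 1000),
    ("genome_A", 1000),
    ("genome_A", 1000),
    ("genome_B", 9000),
]

print(read_level_abundance(toy))  # genome_A: 0.75, genome_B: 0.25
print(base_level_abundance(toy))  # genome_A: 0.25, genome_B: 0.75
```

With nonuniform read lengths, the two estimates can diverge sharply, as the toy input shows. This is one plausible reason to count bases when reads span a wide length range.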

CONCLUSIONS

The Meta-NanoSim characterization module investigates read features, including chimeric information and abundance levels, while the simulation module simulates large and complex multisample microbial communities with different abundance profiles. All trained models and the software are freely accessible at GitHub: https://github.com/bcgsc/NanoSim.
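To make the idea of simulating a community of linear and circular genomes under an abundance profile concrete, here is a simplified sketch. It is not the Meta-NanoSim implementation: it uses a fixed read length, applies no error or chimera model, and all names (`sample_read`, `simulate_community`, the toy genomes) are hypothetical. The only behavior it illustrates is that a read from a circular genome may wrap past the end of the reference, while a read from a linear genome may not.

```python
import random


def sample_read(sequence, read_length, circular, rng):
    """Draw one error-free read from a reference sequence."""
    n = len(sequence)
    read_length = min(read_length, n)  # simplification: never exceed the genome
    if circular:
        # Any start position is valid; doubling the sequence handles wrap-around.
        start = rng.randrange(n)
        return (sequence + sequence)[start:start + read_length]
    # Linear genome: the read must end at or before the last base.
    start = rng.randrange(n - read_length + 1)
    return sequence[start:start + read_length]


def simulate_community(genomes, abundances, n_reads, read_length, seed=0):
    """Sample reads so that source genomes follow the given abundance weights.

    genomes: dict of name -> (sequence, is_circular)
    abundances: dict of name -> non-negative weight (need not sum to 1)
    """
    rng = random.Random(seed)
    names = list(genomes)
    weights = [abundances[name] for name in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        sequence, circular = genomes[name]
        reads.append((name, sample_read(sequence, read_length, circular, rng)))
    return reads


# Hypothetical toy community: one linear and one circular genome.
toy_genomes = {
    "linear_chr": ("ACGTACGTTTGA", False),
    "plasmid": ("GGCATC", True),
}
toy_abundances = {"linear_chr": 0.7, "plasmid": 0.3}

for name, read in simulate_community(toy_genomes, toy_abundances, 5, 4, seed=1):
    print(name, read)
```

Because the read length is fixed in this sketch, read-level and base-level abundances coincide. With a realistic read length distribution, they would have to be distinguished.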
