Microbial Identification Using rRNA Operon Region: Database and Tool for Metataxonomics with Long-Read Sequence.

Author information

eGnome, Inc, Seoul, Republic of Korea.

Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea.

Publication information

Microbiol Spectr. 2022 Apr 27;10(2):e0201721. doi: 10.1128/spectrum.02017-21. Epub 2022 Mar 30.

Abstract

Recent development of long-read sequencing platforms has enabled researchers to explore bacterial community structure through analysis of full-length 16S rRNA gene (∼1,500 bp) or 16S-ITS-23S rRNA operon region (∼4,300 bp), resulting in higher taxonomic resolution than short-read sequencing platforms. Despite the potential of long-read sequencing in metagenomics, resources and protocols for this technology are scarce. Here, we describe MIrROR, the database and analysis tool for metataxonomics using the bacterial 16S-ITS-23S rRNA operon region. We collected 16S-ITS-23S rRNA operon sequences extracted from bacterial genomes from NCBI GenBank and performed curation. A total of 97,781 16S-ITS-23S rRNA operon sequences covering 9,485 species from 43,653 genomes were obtained. For user convenience, we provide an analysis tool based on a mapping strategy that can be used for taxonomic profiling with MIrROR database. To benchmark MIrROR, we compared performance against publicly available databases and tool with mock communities and simulated data sets. Our platform showed promising results in terms of the number of species covered and the accuracy of classification. To encourage active 16S-ITS-23S rRNA operon analysis in the field, BLAST function and taxonomic profiling results with 16S-ITS-23S rRNA operon studies, which have been reported as BioProject on NCBI are provided. MIrROR (http://mirror.egnome.co.kr/) will be a useful platform for researchers who want to perform high-resolution metagenome analysis with a cost-effective sequencer such as MinION from Oxford Nanopore Technologies.

Metabarcoding is a powerful tool to investigate community diversity in an economic and efficient way by amplifying a specific gene marker region. With the advancement of long-read sequencing technologies, the field of metabarcoding has entered a new phase. The technologies have brought a need for development in several areas, including new markers that long-read can cover, database for the markers, tools that reflect long-read characteristics, and compatibility with downstream analysis tools. By constructing MIrROR, we met the need for a database and tools for the 16S-ITS-23S rRNA operon region, which has recently been shown to have sufficient resolution at the species level. Bacterial community analysis using the 16S-ITS-23S rRNA operon region with MIrROR will provide new insights from various research fields.
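
The abstract describes the analysis tool only as "based on a mapping strategy" for taxonomic profiling. As a rough illustration of what such a strategy can look like, the sketch below assigns each long read to its best-scoring reference operon and then sums reads per species. It is not MIrROR's implementation: the function name `profile_species`, the `(read_id, ref_id, score)` record layout, the operon identifiers, and the scores are all hypothetical and chosen only for this example.

```python
from collections import Counter


def profile_species(alignments, ref_to_species):
    """Illustrative best-hit profiling; NOT the actual MIrROR algorithm.

    alignments: iterable of (read_id, ref_id, score) tuples (assumed layout).
    ref_to_species: dict mapping a reference operon ID to a species name.
    Returns a dict of species -> fraction of assigned reads.
    """
    # Keep only the highest-scoring reference for each read.
    best = {}
    for read_id, ref_id, score in alignments:
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (ref_id, score)

    # Count reads per species via the reference-to-species lookup.
    counts = Counter(ref_to_species[ref_id] for ref_id, _ in best.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {species: n / total for species, n in counts.items()}


# Toy input: four reads, three hypothetical reference operons.
alignments = [
    ("r1", "opA", 4100),
    ("r1", "opB", 3900),  # weaker secondary hit for r1, ignored
    ("r2", "opB", 4000),
    ("r3", "opC", 4200),
    ("r4", "opA", 4050),
]
ref_to_species = {
    "opA": "Escherichia coli",
    "opB": "Salmonella enterica",
    "opC": "Escherichia coli",
}

profile = profile_species(alignments, ref_to_species)
print(profile)
```

Because several operon copies from one or more genomes can belong to the same species, the counts are aggregated at the species level rather than per reference sequence. A real pipeline would also need filters (for example on alignment length or identity), which are left out here.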

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f372/9045266/d7a106995408/spectrum.02017-21-f001.jpg
