Reconstructing ribosomal genes from large scale total RNA meta-transcriptomic data.

Author information

Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.

AZTI-Tecnalia, Herrera Kaia, 20110 Pasaia, Spain.

Publication information

Bioinformatics. 2020 Jun 1;36(11):3365-3371. doi: 10.1093/bioinformatics/btaa177.

Abstract

MOTIVATION

Technological advances in meta-transcriptomics have enabled a deeper understanding of the structure and function of microbial communities. 'Total RNA' meta-transcriptomics, sequencing of total reverse transcribed RNA, provides a unique opportunity to investigate both the structure and function of active microbial communities from all three domains of life simultaneously. A major step of this approach is the reconstruction of full-length taxonomic marker genes such as the small subunit ribosomal RNA. However, current tools for this purpose are mainly targeted towards analysis of amplicon and metagenomic data and thus lack the ability to handle the massive and complex datasets typically resulting from total RNA experiments.

RESULTS

In this work, we introduce MetaRib, a new tool for reconstructing ribosomal gene sequences from total RNA meta-transcriptomic data. MetaRib builds on the popular rRNA assembly program EMIRGE and adds several improvements. We address the challenge posed by large, complex datasets by integrating sub-assembly, dereplication and mapping in an iterative approach, followed by additional post-processing steps. We applied the method to both simulated and real-world datasets. Our results show that MetaRib can handle larger datasets and recover more rRNA genes, achieving a roughly 60-fold speedup and a higher F1 score than EMIRGE on simulated datasets. On the real-world dataset, it shows similar trends to a previous analysis based on random sub-sampling but recovers more contigs, while enabling the comparison of individual contig abundances across samples for the first time.
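For readers unfamiliar with the F1 score mentioned above, the following is a minimal sketch of how it can be computed for a set of reconstructed genes against a known reference set, as is possible on simulated data. The abstract does not state how a reconstructed sequence is counted as a correct recovery, so matching by exact identifier is an assumption made purely for illustration, and the function name `f1_score` and the gene labels are hypothetical.

```python
def f1_score(recovered, truth):
    """Harmonic mean of precision and recall for two collections of gene IDs.

    Assumption: a recovered gene counts as correct only if its identifier
    appears in the reference set. A real evaluation would more likely use
    a sequence-similarity criterion.
    """
    recovered, truth = set(recovered), set(truth)
    true_positives = len(recovered & truth)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(recovered)  # fraction of recovered genes that are real
    recall = true_positives / len(truth)         # fraction of real genes that were recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 4 genes in the simulated community, 3 reconstructed,
# of which 2 are correct.
truth = {"geneA", "geneB", "geneC", "geneD"}
recovered = {"geneA", "geneB", "geneX"}
score = f1_score(recovered, truth)
print(f"F1 = {score:.3f}")
```

Here precision is 2/3 and recall is 2/4, so the F1 score is 4/7. A tool that recovers more true genes without adding spurious ones raises both terms, which is why a single F1 value is a convenient summary when comparing assemblers.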

AVAILABILITY AND IMPLEMENTATION

The source code of MetaRib is freely available at https://github.com/yxxue/MetaRib.

CONTACT

yaxin.xue@uib.no or Inge.Jonassen@uib.no.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
