Amino acid based de Bruijn graph algorithm for identifying complete coding genes from metagenomic and metatranscriptomic short reads.

Affiliations

State Key Laboratory of Genetic Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200433, China.

The T-Life Research Center, Fudan University, Shanghai 200433, China.

Publication information

Nucleic Acids Res. 2019 Mar 18;47(5):e30. doi: 10.1093/nar/gkz017.

DOI: 10.1093/nar/gkz017
PMID: 30657979
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6412133/

Abstract

Metagenomic studies, greatly advanced by the rapid development of next-generation sequencing (NGS) technologies, uncover the complex structures of microbial communities and their interactions with the environment. Because most microbes lack reference genome sequences, it is essential to assemble prokaryotic genomes ab initio in order to retrieve complete coding genes from various metabolic pathways. The complex nature of microbial composition and the burden of handling vast amounts of metagenomic data pose great challenges to the development of effective and efficient bioinformatic tools. Here we present a protein assembler, MetaPA, which is based on de Bruijn graph searching in oligopeptide space and can be applied to both metagenomic and metatranscriptomic sequencing data. When public homologous protein sequences are used to guide the assembly procedure, MetaPA assembles 85% of total proteins as complete sequences with a precision of 83% on real high-throughput sequencing datasets. Applying MetaPA to metatranscriptomic data successfully identifies the majority of actively transcribed genes validated in related studies. The results suggest that MetaPA has good potential in both metagenomic and metatranscriptomic studies for characterizing the composition and abundance of microbiota.
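
The core idea named in the abstract, de Bruijn graph searching in oligopeptide space, can be illustrated with a minimal Python sketch. This is not MetaPA's implementation: the function names (`six_frame_peptides`, `build_graph`, `greedy_extend`), the choice of k, the greedy highest-count walk, and the toy sequence are all assumptions made for illustration, and the homology-guided step described in the abstract is not modeled. The sketch translates each short read in six frames, indexes the resulting peptide fragments as amino acid k-mers, and extends a seed through the graph.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
_CODONS = [a + b + c for a in _BASES for b in _BASES for c in _BASES]
CODON_TABLE = dict(zip(_CODONS, _AMINO))
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_peptides(read, min_len):
    """Yield stop-free peptide fragments (>= min_len residues) from all six frames."""
    read = read.upper()
    for strand in (read, read.translate(_COMPLEMENT)[::-1]):
        for offset in range(3):
            protein = "".join(
                # Codons with ambiguous bases (e.g. N) are treated like stops here.
                CODON_TABLE.get(strand[i:i + 3], "*")
                for i in range(offset, len(strand) - 2, 3)
            )
            for fragment in protein.split("*"):
                if len(fragment) >= min_len:
                    yield fragment


def build_graph(peptides, k):
    """Amino acid de Bruijn graph: nodes are (k-1)-mers, edges are counted k-mers."""
    graph = defaultdict(Counter)
    for pep in peptides:
        for i in range(len(pep) - k + 1):
            kmer = pep[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def greedy_extend(graph, seed):
    """Extend a (k-1)-mer seed by always following the most frequent outgoing edge."""
    path, node, visited = seed, seed, {seed}
    while graph.get(node):
        successor, _count = graph[node].most_common(1)[0]
        if successor in visited:  # stop instead of looping around a cycle
            break
        visited.add(successor)
        path += successor[-1]
        node = successor
    return path


if __name__ == "__main__":
    # Hypothetical toy gene and three overlapping "reads" cut from it.
    gene = "ATGAAAACCGCGTATATTGCGAAACAGCGTTAA"
    reads = [gene[0:21], gene[6:27], gene[12:33]]
    k = 4
    fragments = [p for r in reads for p in six_frame_peptides(r, k)]
    graph = build_graph(fragments, k)
    print(greedy_extend(graph, "MKT"))
```

Working on peptides rather than nucleotides means that synonymous codon differences between closely related strains collapse onto the same k-mer, which is one commonly cited motivation for assembling in amino acid space. A real assembler would also need to handle branching, sequencing errors, and spurious off-frame fragments, which the greedy walk above ignores.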
