Fast and simple protein-alignment-guided assembly of orthologous gene families from microbiome sequencing reads.

Affiliations

Center for Bioinformatics, University of Tübingen, Sand 14, 72076, Tübingen, Germany.

Life Sciences Institute, National University of Singapore, 28 Medical Drive, Singapore, 117456, Singapore.

Publication information

Microbiome. 2017 Jan 25;5(1):11. doi: 10.1186/s40168-017-0233-2.

DOI: 10.1186/s40168-017-0233-2
PMID: 28122610
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5267372/

Abstract

BACKGROUND

Microbiome sequencing projects typically collect tens of millions of short reads per sample. Depending on the goals of the project, the short reads can either be subjected to direct sequence analysis or be assembled into longer contigs. The assembly of whole genomes from metagenomic sequencing reads is a very difficult problem. However, for some questions, only specific genes of interest need to be assembled. This is then a gene-centric assembly where the goal is to assemble reads into contigs for a family of orthologous genes.

METHODS

We present a new method for performing gene-centric assembly, called protein-alignment-guided assembly, and provide an implementation in our metagenome analysis tool MEGAN. Genes are assembled on the fly, based on the alignment of all reads against a protein reference database such as NCBI-nr. Specifically, the user selects a gene family based on a classification such as KEGG and all reads binned to that gene family are assembled.
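The workflow described above (bin reads by gene family, then assemble only the reads in the selected bin) can be illustrated with a toy sketch. This is not MEGAN's implementation: the `AlignedRead` record, its field names, the family labels, and the greedy suffix-prefix merge are all assumptions made for illustration. The sketch assumes each read already carries the gene-family label and the reference position obtained from a protein alignment.

```python
from collections import defaultdict
from typing import NamedTuple


class AlignedRead(NamedTuple):
    """Hypothetical record for a read plus its best protein alignment."""
    name: str
    seq: str        # DNA sequence, oriented in the direction of the alignment
    family: str     # gene-family label, e.g. a KEGG-style identifier (made up here)
    ref: str        # identifier of the reference protein the read aligned to
    ref_start: int  # alignment start on the reference, in nucleotide units


def bin_by_family(reads):
    """Group reads by the gene family they were assigned to."""
    bins = defaultdict(list)
    for read in reads:
        bins[read.family].append(read)
    return bins


def overlap_len(a, b, min_len):
    """Longest suffix of a that equals a prefix of b, if at least min_len long."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble_family(reads, min_overlap=5):
    """Greedy toy assembly: order reads by reference position, merge on exact overlap."""
    by_ref = defaultdict(list)
    for read in reads:
        by_ref[read.ref].append(read)

    contigs = []
    for ref in sorted(by_ref):
        ordered = sorted(by_ref[ref], key=lambda r: r.ref_start)
        current = ordered[0].seq
        for read in ordered[1:]:
            k = overlap_len(current, read.seq, min_overlap)
            if k:
                current += read.seq[k:]   # extend the contig past the overlap
            else:
                contigs.append(current)   # no overlap: close contig, start a new one
                current = read.seq
        contigs.append(current)
    return contigs


reads = [
    AlignedRead("r3", "GGAACCTTTAAA", "K00001", "P1", 12),
    AlignedRead("r1", "ATGGCGTACGTT", "K00001", "P1", 0),
    AlignedRead("r2", "TACGTTGGAACC", "K00001", "P1", 6),
    AlignedRead("r4", "CCCCGGGG", "K00002", "P2", 0),
]
bins = bin_by_family(reads)
print(assemble_family(bins["K00001"]))  # one contig spanning r1, r2, and r3
```

The protein alignment does the expensive work in this sketch: it supplies both the bin and the read order, so the assembly step only has to confirm overlaps between neighbouring reads rather than compare all pairs.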

RESULTS

Using published synthetic community metagenome sequencing reads and a set of 41 gene families, we show that the performance of this approach compares favorably with that of full-featured assemblers and that of a recently published HMM-based gene-centric assembler, both in terms of the number of reference genes detected and of the percentage of reference sequence covered.
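The two evaluation measures named above, the number of reference genes detected and the percentage of reference sequence covered, can be computed along the following lines. This is a minimal sketch under stated assumptions: contig-to-reference matches are given as half-open `(start, end)` intervals, and the `min_percent` detection threshold is a hypothetical parameter, since the abstract does not state the criterion the authors used.

```python
def percent_covered(ref_len, intervals):
    """Percentage of a reference of length ref_len covered by the union of intervals.

    Intervals are half-open (start, end) positions on the reference and may overlap.
    """
    covered = 0
    covered_end = 0  # right edge of the region counted so far
    for start, end in sorted(intervals):
        start = max(start, covered_end, 0)  # skip the part already counted
        end = min(end, ref_len)             # clip to the reference
        if end > start:
            covered += end - start
            covered_end = end
    return 100.0 * covered / ref_len


def count_detected(coverages, min_percent=50.0):
    """Number of reference genes whose coverage reaches a (hypothetical) threshold."""
    return sum(1 for c in coverages if c >= min_percent)


# Three contig matches on a 1000 bp reference: two overlap, one runs past the end.
cov = percent_covered(1000, [(0, 300), (200, 500), (800, 1200)])
print(cov)  # 70.0
```

Taking the union of intervals rather than summing their lengths keeps overlapping contigs from being counted twice.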

CONCLUSIONS

Protein-alignment-guided assembly of orthologous gene families complements whole-metagenome assembly in a new and very useful way.

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1df/5267372/8ad799bf86cf/40168_2017_233_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1df/5267372/bbac55325351/40168_2017_233_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1df/5267372/7ca19774fab8/40168_2017_233_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1df/5267372/fa63535ddcb9/40168_2017_233_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1df/5267372/4ef532d78329/40168_2017_233_Fig5_HTML.jpg
