Stitching gene fragments with a network matching algorithm improves gene assembly for metagenomics.

Affiliation

School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA.

Publication information

Bioinformatics. 2012 Sep 15;28(18):i363-i369. doi: 10.1093/bioinformatics/bts388.

DOI:10.1093/bioinformatics/bts388
PMID:22962453
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3436815/
Abstract

MOTIVATION

One of the difficulties in metagenomic assembly is that homologous genes from evolutionarily closely related species may behave like repeats and confuse assemblers. As a result, an assembler may report small contigs, each representing a short gene fragment, instead of complete genes. This further complicates annotation of metagenomic datasets, as annotation tools (such as gene predictors or similarity search tools) typically perform poorly on contigs encoding short gene fragments.
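The repeat-like behaviour described above can be illustrated with a toy de Bruijn graph. The sketch below is not the authors' code: the two gene sequences, the k-mer size, and the helper names (`de_bruijn_edges`, `unitigs`) are assumptions made up for illustration, and real assemblers also handle reverse complements, coverage, and sequencing errors. It only shows that a region shared by two homologous genes creates branch points, so the unambiguous contigs end up shorter than either gene.

```python
from collections import defaultdict


def de_bruijn_edges(seqs, k):
    """Collect one edge per distinct k-mer: (k-1)-mer prefix -> (k-1)-mer suffix."""
    edges = set()
    for s in seqs:
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            edges.add((kmer[:-1], kmer[1:]))
    return edges


def unitigs(edges):
    """Spell maximal non-branching paths; isolated cycles are ignored in this toy."""
    out_nbrs, in_deg, nodes = defaultdict(list), defaultdict(int), set()
    for u, v in edges:
        out_nbrs[u].append(v)
        in_deg[v] += 1
        nodes.update((u, v))

    def is_simple(n):
        # Exactly one way in and one way out: the node lies inside a path.
        return in_deg[n] == 1 and len(out_nbrs[n]) == 1

    contigs = []
    for n in sorted(nodes):
        if is_simple(n):
            continue  # interior node; it is reached from a branching or end node
        for v in sorted(out_nbrs[n]):
            path = n + v[-1]
            while is_simple(v):
                v = out_nbrs[v][0]
                path += v[-1]
            contigs.append(path)
    return contigs


# Hypothetical toy genes: different flanks around a shared 10-base core.
CORE = "GGTTCACGTA"
GENE_A = "ACCAGATC" + CORE + "CTTGAGAA"
GENE_B = "TGACTAAT" + CORE + "GCCTATCC"

alone = unitigs(de_bruijn_edges([GENE_A], k=5))
mixed = unitigs(de_bruijn_edges([GENE_A, GENE_B], k=5))
print(len(alone), len(mixed))  # prints "1 5"
```

Assembled alone, the 26-base toy gene comes back as a single contig. Assembled together with its homolog, the graph branches at both ends of the shared core and yields five fragments of at most 12 bases.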

RESULTS

We present a novel way of using the de Bruijn graph assembly of metagenomes to improve the assembly of genes. A network matching algorithm is proposed for matching the de Bruijn graph of contigs against reference genes, to derive 'gene paths' in the graph (sequences of contigs containing gene fragments) that have the highest similarities to known genes, allowing gene fragments contained in multiple contigs to be connected to form more complete (or intact) genes. Tests on simulated and real datasets show that our approach (called GeneStitch) is able to significantly improve the assembly of genes from metagenomic sequences, by connecting contigs with the guidance of homologous genes, information that is orthogonal to the sequencing reads. We note that the improvement of gene assembly can be observed even when only distantly related genes are available as the reference. We further propose to use 'gene graphs' to represent the assembly of reads from homologous genes and discuss potential applications of gene graphs to improving functional annotation for metagenomics.
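The idea of a 'gene path' can be sketched with a brute-force stand-in. The abstract does not spell out the network matching algorithm, so the code below does not reproduce it: it simply enumerates paths in a small contig graph, stitches the contigs along each path, and keeps the path whose stitched sequence is most similar to a reference gene. The contig names, sequences, the 4-base overlap, and the use of `difflib.SequenceMatcher` as the similarity score are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical contig graph: adjacent contigs overlap by k-1 = 4 bases,
# as they would in a de Bruijn graph built with k = 5.
CONTIGS = {
    "c1": "ACCAGATCGGTT",
    "c2": "TGACTAATGGTT",
    "c3": "GGTTCACGTA",
    "c4": "CGTACTTGAGAA",
    "c5": "CGTAGCCTATCC",
}
EDGES = {"c1": ["c3"], "c2": ["c3"], "c3": ["c4", "c5"], "c4": [], "c5": []}
OVERLAP = 4


def stitch(path, contigs, overlap):
    """Join contigs along a path, dropping the overlap shared with the previous contig."""
    seq = contigs[path[0]]
    for name in path[1:]:
        seq += contigs[name][overlap:]
    return seq


def enumerate_paths(contigs, edges, max_contigs):
    """Yield every simple path (as a list of contig names) up to max_contigs long."""
    def extend(path):
        yield path
        if len(path) < max_contigs:
            for nxt in edges.get(path[-1], []):
                if nxt not in path:
                    yield from extend(path + [nxt])

    for start in contigs:
        yield from extend([start])


def similarity(a, b):
    """Stand-in similarity score in [0, 1]; a real tool would use sequence alignment."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def best_gene_path(contigs, edges, reference, overlap, max_contigs=6):
    """Return (path, score) for the path most similar to the reference gene."""
    scored = ((similarity(reference, stitch(p, contigs, overlap)), p)
              for p in enumerate_paths(contigs, edges, max_contigs))
    score, path = max(scored, key=lambda t: t[0])
    return path, score


# A reference that differs from the true gene by one substitution (made up).
diverged_reference = "ACCAGATCGGTTCACGTACTTGTGAA"
print(best_gene_path(CONTIGS, EDGES, diverged_reference, OVERLAP))
```

Exhaustive enumeration grows exponentially with graph size, so this is only workable on toy inputs. It does show why guidance from a homologous gene can resolve a branch that the reads alone leave ambiguous.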

AVAILABILITY

The tools are available as open source for download at http://omics.informatics.indiana.edu/GeneStitch

CONTACT

yye@indiana.edu.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6649/3436815/991c105590f3/bts388f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6649/3436815/85d6e886d5dc/bts388f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6649/3436815/5169666bd678/bts388f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6649/3436815/9b0905abd915/bts388f4.jpg
