Woods: A fast and accurate functional annotator and classifier of genomic and metagenomic sequences.

Authors

Sharma Ashok K, Gupta Ankit, Kumar Sanjiv, Dhakan Darshan B, Sharma Vineet K

Affiliation

MetaInformatics Laboratory, Metagenomics and Systems Biology Group, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal, Madhya Pradesh, India.

Publication information

Genomics. 2015 Jul;106(1):1-6. doi: 10.1016/j.ygeno.2015.04.001. Epub 2015 Apr 8.

DOI:10.1016/j.ygeno.2015.04.001
PMID:25863333
Abstract

Functional annotation of the gigantic metagenomic data is one of the major time-consuming and computationally demanding tasks, which is currently a bottleneck for the efficient analysis. The commonly used homology-based methods to functionally annotate and classify proteins are extremely slow. Therefore, to achieve faster and accurate functional annotation, we have developed an orthology-based functional classifier 'Woods' by using a combination of machine learning and similarity-based approaches. Woods displayed a precision of 98.79% on independent genomic dataset, 96.66% on simulated metagenomic dataset and >97% on two real metagenomic datasets. In addition, it performed >87 times faster than BLAST on the two real metagenomic datasets. Woods can be used as a highly efficient and accurate classifier with high-throughput capability which facilitates its usability on large metagenomic datasets.
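
The two headline metrics in the abstract, precision and speed-up relative to BLAST, can be illustrated with a minimal sketch. It assumes the standard definition of precision (true positives divided by all positive predictions) and a simple ratio of run times; the abstract does not state how either figure was computed. The function names and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of predicted annotations that are correct: TP / (TP + FP)."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no positive predictions, precision is undefined")
    return true_positives / total


def speedup(baseline_seconds: float, tool_seconds: float) -> float:
    """How many times faster the tool is than the baseline (ratio of run times)."""
    if tool_seconds <= 0:
        raise ValueError("tool run time must be positive")
    return baseline_seconds / tool_seconds


# Hypothetical counts and timings chosen only to illustrate the arithmetic.
example_precision = precision(true_positives=9879, false_positives=121)
example_speedup = speedup(baseline_seconds=870.0, tool_seconds=10.0)
print(f"precision = {example_precision:.2%}, speed-up = {example_speedup:.0f}x")
```

With these illustrative inputs, 9,879 correct predictions out of 10,000 correspond to a precision of 98.79%, and a baseline run taking 87 times as long corresponds to an 87-fold speed-up.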

Similar articles

1. Woods: A fast and accurate functional annotator and classifier of genomic and metagenomic sequences.
   Genomics. 2015 Jul;106(1):1-6. doi: 10.1016/j.ygeno.2015.04.001. Epub 2015 Apr 8.
2. COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets.
   PLoS One. 2015 Nov 11;10(11):e0142102. doi: 10.1371/journal.pone.0142102. eCollection 2015.
3. GHOSTX: A Fast Sequence Homology Search Tool for Functional Annotation of Metagenomic Data.
   Methods Mol Biol. 2017;1611:15-25. doi: 10.1007/978-1-4939-7015-5_2.
4. Construction of customized sub-databases from NCBI-nr database for rapid annotation of huge metagenomic datasets using a combined BLAST and MEGAN approach.
   PLoS One. 2013;8(4):e59831. doi: 10.1371/journal.pone.0059831. Epub 2013 Apr 1.
5. MP4: a machine learning based classification tool for prediction and functional annotation of pathogenic proteins from metagenomic and genomic datasets.
   BMC Bioinformatics. 2022 Nov 28;23(1):507. doi: 10.1186/s12859-022-05061-7.
6. Evaluation of a hybrid approach using UBLAST and BLASTX for metagenomic sequences annotation of specific functional genes.
   PLoS One. 2014 Oct 27;9(10):e110947. doi: 10.1371/journal.pone.0110947. eCollection 2014.
7. From Gene Annotation to Function Prediction for Metagenomics.
   Methods Mol Biol. 2017;1611:27-34. doi: 10.1007/978-1-4939-7015-5_3.
8. A multi-source domain annotation pipeline for quantitative metagenomic and metatranscriptomic functional profiling.
   Microbiome. 2018 Aug 28;6(1):149. doi: 10.1186/s40168-018-0532-2.
9. GRASPx: efficient homolog-search of short peptide metagenome database through simultaneous alignment and assembly.
   BMC Bioinformatics. 2016 Aug 31;17 Suppl 8(Suppl 8):283. doi: 10.1186/s12859-016-1119-1.
10. MetaCAA: A clustering-aided methodology for efficient assembly of metagenomic datasets.
    Genomics. 2014 Feb-Mar;103(2-3):161-8. doi: 10.1016/j.ygeno.2014.02.007. Epub 2014 Mar 5.

Cited by

1. MetaFunc: taxonomic and functional analyses of high throughput sequencing for microbiomes.
   Gut Microbiome (Camb). 2023 Jan 12;4:e4. doi: 10.1017/gmb.2022.12. eCollection 2023.
2. Genome sequencing and functional analysis of a multipurpose medicinal herb Tinospora cordifolia (Giloy).
   Sci Rep. 2024 Feb 2;14(1):2799. doi: 10.1038/s41598-024-53176-z.
3. Metagenomic exploration of Andaman region of the Indian Ocean.
   Sci Rep. 2024 Feb 1;14(1):2717. doi: 10.1038/s41598-024-53190-1.
4. Application of artificial intelligence approaches to predict the metabolism of xenobiotic molecules by human gut microbiome.
   Front Microbiol. 2023 Dec 5;14:1254073. doi: 10.3389/fmicb.2023.1254073. eCollection 2023.
5. Evolution of Diagnostic and Forensic Microbiology in the Era of Artificial Intelligence.
   Cureus. 2023 Sep 21;15(9):e45738. doi: 10.7759/cureus.45738. eCollection 2023 Sep.
6. Artificial Intelligence: A Promising Tool in Exploring the Phytomicrobiome in Managing Disease and Promoting Plant Health.
   Plants (Basel). 2023 Apr 30;12(9):1852. doi: 10.3390/plants12091852.
7. MP4: a machine learning based classification tool for prediction and functional annotation of pathogenic proteins from metagenomic and genomic datasets.
   BMC Bioinformatics. 2022 Nov 28;23(1):507. doi: 10.1186/s12859-022-05061-7.
8. Machine Learning and Deep Learning Applications in Metagenomic Taxonomy and Functional Annotation.
   Front Microbiol. 2022 Mar 14;13:811495. doi: 10.3389/fmicb.2022.811495. eCollection 2022.
9. Music of metagenomics-a review of its applications, analysis pipeline, and associated tools.
   Funct Integr Genomics. 2022 Feb;22(1):3-26. doi: 10.1007/s10142-021-00810-y. Epub 2021 Oct 18.
10. K-Nearest Neighbor and Random Forest-Based Prediction of Putative Tyrosinase Inhibitory Peptides of Abalone .
    Molecules. 2021 Jun 16;26(12):3671. doi: 10.3390/molecules26123671.