Mantis: flexible and consensus-driven genome annotation.

Affiliations

Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg.

Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg.

Publication information

Gigascience. 2021 Jun 2;10(6). doi: 10.1093/gigascience/giab042.

DOI:10.1093/gigascience/giab042
PMID:34076241
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8170692/
Abstract

BACKGROUND

The rapid development of the (meta-)omics fields has produced an unprecedented amount of high-resolution and high-fidelity data. Using these datasets, we can infer the roles of previously functionally unannotated proteins from single organisms and consortia. In this context, protein function annotation can be described as the identification of regions of interest (i.e., domains) in protein sequences and the assignment of biological functions to them. Despite the existence of numerous tools, challenges remain in terms of speed, flexibility, and reproducibility. In the big data era, it is also increasingly important to stop limiting our findings to a single reference and instead to coalesce knowledge from different data sources, thereby overcoming some of the limitations of relying too heavily on computationally generated data from single sources.

RESULTS

We implemented a protein annotation tool, Mantis, which uses database identifier intersection and text mining to integrate knowledge from multiple reference data sources into a single consensus-driven output. Mantis is flexible, allowing for the customization of reference data and execution parameters, and is reproducible across different research goals and user environments. We implemented a depth-first search algorithm for domain-specific annotation, which significantly improved annotation performance compared to sequence-wide annotation. The parallelized implementation of Mantis results in short runtimes while also producing high-coverage, high-quality protein function annotations.
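
The abstract does not spell out how the depth-first search is applied, so the following is only an illustrative sketch and not the Mantis implementation. It assumes that each reference hit covers a residue range of the query protein and carries a score, and that domain-specific annotation means choosing the best-scoring set of non-overlapping hits rather than one hit for the whole sequence. The `Hit` type, its field names, and the additive scoring rule are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """A hypothetical reference hit on a query protein."""
    ref_id: str   # identifier of the matched reference profile (made up here)
    start: int    # first residue covered, inclusive
    end: int      # last residue covered, inclusive
    score: float  # higher is better, e.g. a bit score (assumption)


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if the two hits share at least one residue."""
    return a.start <= b.end and b.start <= a.end


def best_combination(hits: List[Hit]) -> Tuple[float, List[Hit]]:
    """Depth-first search over all sets of mutually non-overlapping hits.

    Returns the highest total score and the hits that produce it.
    The search is exhaustive, so it is only suitable for small hit lists.
    """
    ordered = sorted(hits, key=lambda h: (h.start, h.end))
    best_score = 0.0
    best_combo: List[Hit] = []

    def dfs(index: int, chosen: List[Hit], total: float) -> None:
        nonlocal best_score, best_combo
        if total > best_score:
            best_score, best_combo = total, list(chosen)
        # Try to extend the current combination with each later hit.
        for i in range(index, len(ordered)):
            candidate = ordered[i]
            if all(not overlaps(candidate, c) for c in chosen):
                chosen.append(candidate)
                dfs(i + 1, chosen, total + candidate.score)
                chosen.pop()  # backtrack

    dfs(0, [], 0.0)
    return best_score, best_combo


if __name__ == "__main__":
    example = [
        Hit("A", 1, 100, 50.0),
        Hit("B", 1, 300, 120.0),   # one sequence-wide hit
        Hit("C", 120, 300, 90.0),
    ]
    score, combo = best_combination(example)
    # Two domain-level hits (A and C, total 140.0) beat the single hit B.
    print(score, [h.ref_id for h in combo])
```

In this toy example the two domain-level hits together outscore the single sequence-wide hit, which is the kind of case where a per-domain search can recover annotations that a whole-sequence best hit would miss.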

CONCLUSIONS

Mantis is a protein function annotation tool that produces high-quality consensus-driven protein annotations. It is easy to set up, customize, and use, scaling from single genomes to large metagenomes. Mantis is available under the MIT license at https://github.com/PedroMTQ/mantis.
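
As a rough illustration of consensus by database identifier intersection, the sketch below groups hits from different reference sources whenever their identifier sets intersect, directly or through a shared neighbor, and treats the largest group as the consensus. This is an assumption-laden sketch rather than the Mantis algorithm: the function name, the input layout, and the identifier strings are hypothetical, and the text-mining step mentioned in the abstract is not modeled.

```python
from typing import Dict, List, Set


def consensus_groups(hits: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group hit names whose identifier sets intersect.

    `hits` maps a hit name to the set of database identifiers attached
    to it. Two hits land in the same group if they share an identifier
    directly or through a chain of other hits. Groups are returned
    largest first.
    """
    groups: List[Set[str]] = []
    seen: Set[str] = set()
    for name in hits:
        if name in seen:
            continue
        group: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            # Follow every hit that shares at least one identifier.
            for other, ids in hits.items():
                if other not in group and hits[current] & ids:
                    stack.append(other)
        seen |= group
        groups.append(group)
    return sorted(groups, key=len, reverse=True)


if __name__ == "__main__":
    # Identifier strings are placeholders for illustration only.
    example = {
        "source1_hit": {"id:alpha", "id:beta"},
        "source2_hit": {"id:beta", "id:gamma"},
        "source3_hit": {"id:gamma"},
        "source4_hit": {"id:delta"},
    }
    for group in consensus_groups(example):
        print(sorted(group))
```

Here the first three hits agree through shared identifiers and form the consensus group, while the fourth hit stands alone because nothing links it to the others.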
