Mantis: flexible and consensus-driven genome annotation.

Affiliations

Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg.

Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg.

Publication information

Gigascience. 2021 Jun 2;10(6). doi: 10.1093/gigascience/giab042.

Abstract

BACKGROUND

The rapid development of the (meta-)omics fields has produced an unprecedented amount of high-resolution and high-fidelity data. These datasets allow us to infer the roles of previously functionally unannotated proteins from single organisms and consortia. In this context, protein function annotation can be described as the identification of regions of interest (i.e., domains) in protein sequences and the assignment of biological functions. Despite the existence of numerous tools, challenges remain in terms of speed, flexibility, and reproducibility. In the big data era, it is also increasingly important to stop limiting our findings to a single reference and instead to coalesce knowledge from different data sources, thereby overcoming some of the limitations of relying too heavily on computationally generated data from single sources.

RESULTS

We implemented a protein annotation tool, Mantis, which uses database identifier intersection and text mining to integrate knowledge from multiple reference data sources into a single consensus-driven output. Mantis is flexible, allowing for the customization of reference data and execution parameters, and is reproducible across different research goals and user environments. We implemented a depth-first search algorithm for domain-specific annotation, which significantly improved annotation performance compared to sequence-wide annotation. The parallelized implementation of Mantis yields short runtimes while producing high-coverage, high-quality protein function annotations.
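
To illustrate the idea of domain-specific annotation by depth-first search, the following is a minimal sketch and not the Mantis implementation. It assumes each hit against a reference carries a start and end position on the query protein and a score where higher is better. The `Hit` fields, the overlap rule, the `best_combination` function, and the choice to rank combinations by summed score are all hypothetical choices made for this example. The search enumerates combinations of mutually non-overlapping hits and keeps the best-scoring one, which can outperform picking the single best sequence-wide hit.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """A hypothetical reference hit on a query protein (not Mantis's data model)."""
    ref_id: str   # identifier of the matched reference entry
    start: int    # inclusive start position on the query
    end: int      # inclusive end position on the query
    score: float  # assumed score, higher is better


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if two hits share at least one residue position."""
    return a.start <= b.end and b.start <= a.end


def best_combination(hits: List[Hit]) -> Tuple[float, List[Hit]]:
    """Depth-first search over all sets of mutually non-overlapping hits.

    Returns the highest summed score and the hits that produce it.
    Worst-case cost is exponential in the number of hits, so this is
    only meant for small illustrative inputs.
    """
    ordered = sorted(hits, key=lambda h: (h.start, h.end))
    best: Tuple[float, List[Hit]] = (0.0, [])

    def dfs(index: int, chosen: List[Hit], total: float) -> None:
        nonlocal best
        # Record the current combination if it beats the best seen so far.
        if total > best[0]:
            best = (total, list(chosen))
        # Try to extend the combination with each later, compatible hit.
        for i in range(index, len(ordered)):
            candidate = ordered[i]
            if all(not overlaps(candidate, c) for c in chosen):
                chosen.append(candidate)
                dfs(i + 1, chosen, total + candidate.score)
                chosen.pop()  # backtrack

    dfs(0, [], 0.0)
    return best


# Made-up example: hit "A" is the best single (sequence-wide) hit,
# but the non-overlapping domains "B", "C", and "D" together score higher.
example_hits = [
    Hit("A", 1, 100, 50.0),
    Hit("B", 1, 60, 40.0),
    Hit("C", 70, 150, 30.0),
    Hit("D", 160, 200, 10.0),
]
score, combination = best_combination(example_hits)
print(score, [h.ref_id for h in combination])
```

In this made-up example, a sequence-wide approach would report only hit "A", whereas the depth-first search recovers three separate domain-level hits. A real tool would likely need pruning or a fallback heuristic for proteins with many hits, since exhaustive enumeration grows exponentially.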

CONCLUSIONS

Mantis is a protein function annotation tool that produces high-quality consensus-driven protein annotations. It is easy to set up, customize, and use, scaling from single genomes to large metagenomes. Mantis is available under the MIT license at https://github.com/PedroMTQ/mantis.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dddb/8170692/2060ff8a7f3f/giab042fig1.jpg
