Adaptive adjustment of significance thresholds produces large gains in microbial gene annotations and metabolic insights.

Authors

Kananen Kathryn, Veseli Iva, Quiles Pérez Christian J, Miller Samuel, Eren A Murat, Bradley Patrick H

Affiliations

Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA.

Helmholtz Institute for Functional Marine Biodiversity, 26129, Oldenburg, Germany.

Publication information

bioRxiv. 2024 Jul 5:2024.07.03.601779. doi: 10.1101/2024.07.03.601779.

Abstract

Gene function annotations enable microbial ecologists to make inferences about metabolic potential from genomes and metagenomes. However, even tools that use the same database and general approach can differ markedly in the annotations they recover. We compare three popular methods for identifying KEGG Orthologs, applying them to genomes drawn from a range of bacterial families that occupy different host-associated and free-living biomes. Our results show that by adaptively tuning sequence similarity thresholds, sensitivity can be substantially improved while maintaining accuracy. We observe the largest improvements when few reference sequences exist for a given protein family, and when annotating genomes from non-model organisms (such as gut-dwelling Lachnospiraceae). Our results suggest that straightforward heuristic adjustments can broadly improve microbial metabolic predictions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89b6/11245035/a59dc01fe605/nihpp-2024.07.03.601779v1-f0001.jpg
