National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
mSystems. 2024 Mar 19;9(3):e0003624. doi: 10.1128/msystems.00036-24. Epub 2024 Feb 16.
Analyzing microbial genomes has become an essential part of microbiology research, giving valuable insights into the functions and evolution of microbial species. Identifying genes of interest and assigning putative annotations to those genes is a central task in genome analysis, and a plethora of tools and approaches have been developed for this task. The ProkFunFind tool was developed to bridge the gap between these various annotation approaches, providing a flexible and customizable search approach to annotate microbial functions. ProkFunFind is designed around hierarchical definitions of biological functions, where individual genes can be identified using heterogeneous search terms consisting of sequences, profile hidden Markov models, protein domains, and orthology groups. This design allows searches to be tailored to specific biological functions, and the search results are output in multiple formats to facilitate downstream analyses. The utility of ProkFunFind was demonstrated by searching for bacterial flagella, which are complex organelles encoded by multiple genes. Overall, ProkFunFind provides an accessible and flexible way to integrate multiple types of annotation and sequence data while annotating biological functions in microbial genomes.
IMPORTANCE
Genome sequencing and analysis are increasingly important parts of microbiology, providing a way to predict metabolic functions, identify virulence factors, and understand the evolution of microbes. The expanded use of genome sequencing has also brought an abundance of search and annotation methods, but integrating the information from these different methods can be challenging and is often done through ad hoc approaches. To bridge the gap between different types of annotations, we developed ProkFunFind, a flexible and customizable search tool incorporating multiple search approaches and annotation types to annotate microbial functions. We demonstrated the utility of ProkFunFind by searching for gene clusters encoding flagellar genes using a combination of different annotation types and searches. Overall, ProkFunFind provides a reproducible and flexible way to identify gene clusters of interest, facilitating the meaningful analysis of new and existing microbial genomes.
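The idea of a hierarchical function definition with heterogeneous search terms can be illustrated with a small sketch. The code below is not ProkFunFind's actual definition format or API; the `Function` class, the `is_present` helper, the `required` flag, and every term identifier are hypothetical names chosen for illustration. It assumes a function counts as present when all of its required sub-functions are present, and that a leaf gene is found when any one of its search terms (sequence, profile HMM, domain, or orthology group) has a hit.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """A node in a hypothetical hierarchical function definition."""
    name: str
    required: bool = True
    terms: set = field(default_factory=set)       # leaf: search-term IDs
    children: list = field(default_factory=list)  # internal: sub-functions


def is_present(node, hits):
    """Return True if the node is supported by the set of matched term IDs."""
    if not node.children:
        # Leaf gene: a hit to any one of its heterogeneous terms is enough.
        return bool(node.terms & hits)
    required = [c for c in node.children if c.required]
    if required:
        # Internal node: every required sub-function must be present.
        return all(is_present(c, hits) for c in required)
    # No required children: any present sub-function is enough.
    return any(is_present(c, hits) for c in node.children)


# Placeholder identifiers, not real database accessions.
filament = Function("filament", terms={"hmm:filament_profile",
                                       "domain:filament_domain"})
motor = Function("motor", children=[
    Function("motor_A", terms={"ortholog:motor_A"}),
    Function("motor_B", terms={"ortholog:motor_B"}),
])
chemotaxis = Function("chemotaxis", required=False,
                      terms={"seq:chemotaxis_protein"})
flagellum = Function("flagellum", children=[filament, motor, chemotaxis])

# Term IDs that matched in one hypothetical genome.
hits = {"domain:filament_domain", "ortholog:motor_A", "ortholog:motor_B"}
print(is_present(flagellum, hits))
```

In this sketch the filament is detected through a domain hit while the motor genes are detected through orthology groups, which is the kind of mixing of annotation types the abstract describes. The optional chemotaxis component does not affect the result.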