Environmental Genomics & Systems Biology, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, United States.
Database (Oxford). 2024 Sep 6;2024. doi: 10.1093/database/baae089.
Automated annotations of protein functions are error-prone because of our lack of knowledge of protein functions. For example, it is often impossible to predict the correct substrate for an enzyme or a transporter. Furthermore, much of the knowledge that we do have about the functions of proteins is missing from the underlying databases. We discuss how to use interactive tools to quickly find different kinds of information relevant to a protein's function. Many of these tools are available via PaperBLAST (http://papers.genomics.lbl.gov). Combining these tools often allows us to infer a protein's function. Ideally, accurate annotations would allow us to predict a bacterium's capabilities from its genome sequence, but in practice, this remains challenging. We describe interactive tools that infer potential capabilities from a genome sequence or that search a genome to find proteins that might perform a specific function of interest. Database URL: http://papers.genomics.lbl.gov.