Annotation of biologically relevant ligands in UniProtKB using ChEBI.

Affiliation

Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1211 Geneva 4, Switzerland.

Publication information

Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac793.

DOI:10.1093/bioinformatics/btac793

PMID:36484697

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9825770/

Abstract

MOTIVATION

To provide high-quality, computationally tractable annotation of binding sites for biologically relevant (cognate) ligands in UniProtKB using the chemical ontology ChEBI (Chemical Entities of Biological Interest), to better support efforts to study and predict functionally relevant interactions between protein sequences and structures and small molecule ligands.

RESULTS

We structured the data model for cognate ligand binding site annotations in UniProtKB and performed a complete reannotation of all cognate ligand binding sites using stable unique identifiers from ChEBI, which we now use as the reference vocabulary for all such annotations. We developed improved search and query facilities for cognate ligands in the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that ChEBI provides.

AVAILABILITY AND IMPLEMENTATION

Binding site annotations for cognate ligands described using ChEBI are available for UniProtKB protein sequence records in several formats (text, XML and RDF) and are freely available to query and download through the UniProt website (www.uniprot.org), REST API (www.uniprot.org/help/api), SPARQL endpoint (sparql.uniprot.org/) and FTP site (https://ftp.uniprot.org/pub/databases/uniprot/).
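
As an illustration of how a ligand-centric query against the REST API might be composed, the following minimal Python sketch builds a search URL for proteins with a binding site annotated for a given ChEBI identifier. Several details are assumptions not stated in the abstract: the base URL `https://rest.uniprot.org/uniprotkb/search`, the query field name `ft_binding`, the `reviewed:true` filter, and the `format` and `size` parameters. The function name `build_ligand_search_url` is hypothetical, and CHEBI:15422 (ATP) is used only as an example. Check the API documentation at www.uniprot.org/help/api before relying on any of these.

```python
from urllib.parse import urlencode

# Assumed base URL of the UniProtKB search service; not given in the abstract.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_ligand_search_url(chebi_id, reviewed_only=True, size=5):
    """Build a search URL for entries with a binding site for a ChEBI ligand.

    The field name `ft_binding` and the `reviewed:true` filter are
    assumptions about the query syntax, used here for illustration only.
    """
    if not chebi_id.startswith("CHEBI:"):
        raise ValueError("expected an identifier of the form 'CHEBI:<number>'")

    # Quote the identifier so the colon inside it is not read as a field separator.
    query = f'ft_binding:"{chebi_id}"'
    if reviewed_only:
        query += " AND reviewed:true"

    params = {"query": query, "format": "json", "size": size}
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # CHEBI:15422 is ATP; the printed URL can be fetched with any HTTP client.
    print(build_ligand_search_url("CHEBI:15422"))
```

The sketch only constructs the URL and makes no network request, so it can be inspected or adapted without contacting the service.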

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
