Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1211 Geneva 4, Switzerland.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac793.
To provide high-quality, computationally tractable annotation of binding sites for biologically relevant (cognate) ligands in UniProtKB using the chemical ontology ChEBI (Chemical Entities of Biological Interest), and thereby better support efforts to study and predict functionally relevant interactions between protein sequences and structures and small-molecule ligands.
We structured the data model for cognate ligand binding site annotations in UniProtKB and performed a complete reannotation of all cognate ligand binding sites using stable unique identifiers from ChEBI, which we now use as the reference vocabulary for all such annotations. We developed improved search and query facilities for cognate ligands on the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that ChEBI provides.
Binding site annotations for cognate ligands described using ChEBI are available for UniProtKB protein sequence records in several formats (text, XML and RDF) and are freely available to query and download through the UniProt website (www.uniprot.org), REST API (www.uniprot.org/help/api), SPARQL endpoint (sparql.uniprot.org/) and FTP site (https://ftp.uniprot.org/pub/databases/uniprot/).
Supplementary data are available at Bioinformatics online.
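As an illustration of how the structured annotations might be consumed from the text format, the following minimal Python sketch extracts binding sites and their ChEBI identifiers from feature-table lines. It rests on assumptions that are not stated in the abstract: that binding sites appear as `FT   BINDING` lines followed by `/ligand` and `/ligand_id` qualifier lines, and that each qualifier fits on one line. The sample text, the `BindingSite` class and the `parse_binding_sites` function are hypothetical names introduced here for illustration only.

```python
import re
from dataclasses import dataclass, field

# Illustrative sample in the assumed flat-file layout (not copied from a real entry).
SAMPLE = """\
FT   BINDING         13..20
FT                   /ligand="ATP"
FT                   /ligand_id="ChEBI:CHEBI:30616"
FT   BINDING         45
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT   ACT_SITE        102
FT                   /note="Proton acceptor"
"""


@dataclass
class BindingSite:
    """One BINDING feature: residue range plus its qualifiers."""
    start: int
    end: int
    qualifiers: dict = field(default_factory=dict)

    @property
    def chebi_id(self):
        # Assumed form 'ChEBI:CHEBI:30616'; strip the database prefix.
        raw = self.qualifiers.get("ligand_id", "")
        return raw.split(":", 1)[1] if raw.startswith("ChEBI:") else None


# Feature header: 'FT', three spaces, feature key, then a position or range.
FEATURE_RE = re.compile(r"^FT   (\S+)\s+(\d+)(?:\.\.(\d+))?\s*$")
# Qualifier line: 'FT', padding, then /name="value" on a single line.
QUALIFIER_RE = re.compile(r'^FT\s+/(\w+)="(.*)"\s*$')


def parse_binding_sites(text):
    """Return a list of BindingSite objects for the BINDING features in text."""
    sites = []
    current = None
    for line in text.splitlines():
        header = FEATURE_RE.match(line)
        if header:
            key, start, end = header.groups()
            if key == "BINDING":
                current = BindingSite(int(start), int(end or start))
                sites.append(current)
            else:
                current = None  # ignore qualifiers of other feature types
            continue
        qualifier = QUALIFIER_RE.match(line)
        if qualifier and current is not None:
            current.qualifiers[qualifier.group(1)] = qualifier.group(2)
    return sites


if __name__ == "__main__":
    for site in parse_binding_sites(SAMPLE):
        print(site.start, site.end, site.qualifiers.get("ligand"), site.chebi_id)
```

Because each site carries a ChEBI identifier rather than free text, sites can be grouped or filtered by ligand without string matching on ligand names. Qualifier values wrapped across several lines and non-numeric positions are not handled by this sketch.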