AutoBind: automatic extraction of protein-ligand-binding affinity data from biological literature.

Affiliation

Department of Electrical Engineering, Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 70101, Taiwan.

Publication information

Bioinformatics. 2012 Aug 15;28(16):2162-8. doi: 10.1093/bioinformatics/bts367. Epub 2012 Jul 2.

DOI:10.1093/bioinformatics/bts367

PMID:22753780

Abstract

MOTIVATION

Determination of the binding affinity of a protein-ligand complex is important to quantitatively specify whether a particular small molecule will bind to the target protein. Besides, collection of comprehensive datasets for protein-ligand complexes and their corresponding binding affinities is crucial in developing accurate scoring functions for the prediction of the binding affinities of previously unknown protein-ligand complexes. In the past decades, several databases of protein-ligand-binding affinities have been created via visual extraction from literature. However, such approaches are time-consuming and most of these databases are updated only a few times per year. Hence, there is an immediate demand for an automatic extraction method with high precision for binding affinity collection.

RESULT

We have created a new database of protein-ligand-binding affinity data, AutoBind, based on automatic information retrieval. We first compiled a collection of 1586 articles where the binding affinities have been marked manually. Based on this annotated collection, we designed four sentence patterns that are used to scan full-text articles as well as a scoring function to rank the sentences that match our patterns. The proposed sentence patterns can effectively identify the binding affinities in full-text articles. Our assessment shows that AutoBind achieved 84.22% precision and 79.07% recall on the testing corpus. Currently, 13 616 protein-ligand complexes and the corresponding binding affinities have been deposited in AutoBind from 17 221 articles.
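The abstract does not spell out the four sentence patterns or the scoring function, so the following is only a minimal sketch of the general idea: match sentences against a pattern for an affinity value, then rank the matches by a score. The regular expression, the cue-word list, and the names `AFFINITY_RE`, `CUE_WORDS`, `extract`, `score`, and `rank` are all hypothetical and are not taken from AutoBind.

```python
import re

# Hypothetical sentence pattern (not one of AutoBind's four patterns):
# an affinity type, a linking word, a number, and a concentration unit.
AFFINITY_RE = re.compile(
    r"\b(?P<kind>K[di]|IC50|EC50)\b"
    r"(?:\s+(?:value|values))?"
    r"\s*(?:=|of|was|is)\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[pnuµμm]M)\b"
)

# Hypothetical cue words that make a matching sentence more likely
# to report a protein-ligand binding affinity.
CUE_WORDS = ("bind", "affinity", "inhibit", "dissociation")


def extract(sentence):
    """Return (kind, value, unit) tuples for every pattern match."""
    return [
        (m["kind"], float(m["value"]), m["unit"])
        for m in AFFINITY_RE.finditer(sentence)
    ]


def score(sentence):
    """Toy score: number of matches plus number of cue words present.

    Sentences without any pattern match score 0.
    """
    hits = extract(sentence)
    if not hits:
        return 0
    lowered = sentence.lower()
    return len(hits) + sum(cue in lowered for cue in CUE_WORDS)


def rank(sentences):
    """Return the matching sentences, highest score first."""
    scored = [(score(s), s) for s in sentences]
    ordered = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [s for sc, s in ordered if sc > 0]


if __name__ == "__main__":
    corpus = [
        "Crystals were grown at 20 degrees.",
        "The IC50 value was 40 µM.",
        "The inhibitor binds the protease with a Ki of 2.5 nM.",
    ]
    for sentence in rank(corpus):
        print(score(sentence), extract(sentence), sentence)
```

In a sketch like this, precision and recall would be measured by comparing the extracted values against the manually annotated sentences, as the abstract reports for the real system.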

AVAILABILITY

AutoBind is automatically updated on a monthly basis, and it is freely available at http://autobind.csie.ncku.edu.tw/ and http://autobind.mc.ntu.edu.tw/. All of the deposited binding affinities have been refined and approved manually before being released.

Similar articles

Textpresso: an ontology-based information retrieval and extraction system for biological literature.

PLoS Biol. 2004 Nov;2(11):e309. doi: 10.1371/journal.pbio.0020309. Epub 2004 Sep 21.

Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature.

Bioinformatics. 2011 Feb 1;27(3):408-15. doi: 10.1093/bioinformatics/btq667. Epub 2010 Dec 7.

Challenges for automatically extracting molecular interactions from full-text articles.

BMC Bioinformatics. 2009 Sep 24;10:311. doi: 10.1186/1471-2105-10-311.

PDB-wide collection of binding data: current status of the PDBbind database.

Bioinformatics. 2015 Feb 1;31(3):405-12. doi: 10.1093/bioinformatics/btu626. Epub 2014 Oct 9.

BgN-Score and BsN-Score: bagging and boosting based ensemble neural networks scoring functions for accurate binding affinity prediction of protein-ligand complexes.

BMC Bioinformatics. 2015;16 Suppl 4(Suppl 4):S8. doi: 10.1186/1471-2105-16-S4-S8. Epub 2015 Feb 23.

A general approach for developing system-specific functions to score protein-ligand docked complexes using support vector inductive logic programming.

Proteins. 2007 Dec 1;69(4):823-31. doi: 10.1002/prot.21782.

An evaluation of GO annotation retrieval for BioCreAtIvE and GOA.

BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S17. doi: 10.1186/1471-2105-6-S1-S17. Epub 2005 May 24.

BioDownloader: bioinformatics downloads and updates in a few clicks.

Bioinformatics. 2007 Jun 1;23(11):1437-9. doi: 10.1093/bioinformatics/btm120. Epub 2007 May 5.

Protemot: prediction of protein binding sites with automatically extracted geometrical templates.

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W303-9. doi: 10.1093/nar/gkl344.

Cited by

Detection and categorization of bacteria habitats using shallow linguistic analysis.

BMC Bioinformatics. 2015;16 Suppl 10(Suppl 10):S5. doi: 10.1186/1471-2105-16-S10-S5. Epub 2015 Jul 13.

Quantum Mechanics Approaches to Drug Research in the Era of Structural Chemogenomics.

Int J Quantum Chem. 2013 Jun 15;113(12):1669-1675. doi: 10.1002/qua.24400.

Compound activity prediction using models of binding pockets or ligand properties in 3D.

Curr Top Med Chem. 2012;12(17):1869-82. doi: 10.2174/156802612804547335.
