
PreBIND and Textomy--mining the biomedical literature for protein-protein interactions using a support vector machine.

Authors

Donaldson Ian, Martin Joel, de Bruijn Berry, Wolting Cheryl, Lay Vicki, Tuekam Brigitte, Zhang Shudong, Baskin Berivan, Bader Gary D, Michalickova Katerina, Pawson Tony, Hogue Christopher W V

Affiliation

Samuel Lunenfeld Research Institute, Toronto, M5G 1X5, Canada.

Publication

BMC Bioinformatics. 2003 Mar 27;4:11. doi: 10.1186/1471-2105-4-11.


DOI:10.1186/1471-2105-4-11
PMID:12689350
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC153503/

Abstract

BACKGROUND: Most experimentally verified molecular interaction and biological pathway data reside in the unstructured text of biomedical journal articles, where they are inaccessible to computational methods. The Biomolecular Interaction Network Database (BIND) seeks to capture these data in a machine-readable format. We hypothesized that the formidable task of backfilling the database could be reduced by using support vector machine (SVM) technology to first locate interaction information in the literature. We present an information extraction system designed to locate protein-protein interaction data in the literature and to present these data to curators and the public for review and entry into BIND.

RESULTS: Cross-validation estimated the SVM's test-set precision, accuracy and recall for classifying abstracts that describe interaction information to be 92%, 90% and 92%, respectively. We estimated that the system would be able to recall up to 60% of all non-high-throughput interactions present in another yeast protein interaction database. Finally, the system was applied to a real-world curation problem, where its use reduced the task duration by 70%, saving 176 days.

CONCLUSIONS: Machine learning methods are useful tools for directing the backfilling of interaction and pathway databases. However, this potential can only be realized if these techniques are coupled with human review and entry into a factual database such as BIND. The PreBIND system described here is available to the public at http://bind.ca. Current capabilities allow searching for human, mouse and yeast protein-interaction information.
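
The precision, accuracy and recall figures quoted in the Results can be easier to interpret with the definitions written out. The sketch below is a minimal illustration in Python, not the authors' code: the `Confusion` class, the helper names and all counts are hypothetical. The counts in `illustrative` were chosen only so that they reproduce 92% precision, 90% accuracy and 92% recall; they are not taken from the paper, which reports percentages only in this abstract.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion counts for a binary 'describes an interaction' classifier."""
    tp: int  # interaction abstracts predicted as interaction
    fp: int  # non-interaction abstracts predicted as interaction
    tn: int  # non-interaction abstracts predicted as non-interaction
    fn: int  # interaction abstracts predicted as non-interaction


def tally(y_true, y_pred):
    """Count outcomes from parallel lists of 0/1 labels (1 = interaction)."""
    c = Confusion(0, 0, 0, 0)
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            c.tp += 1
        elif truth == 0 and pred == 1:
            c.fp += 1
        elif truth == 0 and pred == 0:
            c.tn += 1
        else:
            c.fn += 1
    return c


def precision(c):
    """Fraction of predicted-interaction abstracts that are truly interaction."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c):
    """Fraction of true interaction abstracts that the classifier found."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def accuracy(c):
    """Fraction of all abstracts that were classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical counts (NOT from the paper), chosen to match the reported rates.
illustrative = Confusion(tp=92, fp=8, tn=52, fn=8)

# A tiny made-up label set showing how the counts are derived from predictions.
toy = tally(y_true=[1, 1, 1, 0, 0], y_pred=[1, 1, 0, 1, 0])

if __name__ == "__main__":
    print(f"precision={precision(illustrative):.2f}")
    print(f"accuracy={accuracy(illustrative):.2f}")
    print(f"recall={recall(illustrative):.2f}")
```

With these made-up counts, accuracy comes out below both precision and recall because the negative class is smaller than the positive class, so each misclassified negative weighs more heavily on the overall error rate. Whether the paper's evaluation set had such a class balance is not stated in the abstract.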

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/225d/153503/e8b126add176/1471-2105-4-11-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/225d/153503/4ee60b3a0d9b/1471-2105-4-11-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/225d/153503/6abd525c35cb/1471-2105-4-11-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/225d/153503/485ba1d8e5c0/1471-2105-4-11-4.jpg
