
A hybrid method for relation extraction from biomedical literature.

Authors

Huang Minlie, Zhu Xiaoyan, Li Ming

Affiliation

State Key Laboratory of Intelligent Technology and Systems (LITS), Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.

Publication

Int J Med Inform. 2006 Jun;75(6):443-55. doi: 10.1016/j.ijmedinf.2005.06.010. Epub 2005 Aug 10.

DOI:10.1016/j.ijmedinf.2005.06.010
PMID:16095962
Abstract

PURPOSE

In recent years, there has been growing interest in extracting entities and relations from biomedical literature. A vast number of systems and approaches have been proposed to extract biological relations, but none of them achieves satisfactory results. These methodologies are either parsing-based or pattern-based; they either cannot handle the grammatical complexities of biomedical texts or are too complicated to adapt. It is well known that appositive and coordinative propositions, and similar grammatical structures, are extremely common in biomedical texts, particularly in full texts. However, most researchers have left these problems untouched.

METHODS

In this paper, we propose a new hybrid approach that combines shallow parsing and pattern matching to extract relations between proteins from biomedical scientific papers. In this method, appositive and coordinative structures are interpreted on the basis of shallow parsing analysis, using both syntactic and semantic constraints. Long sentences are then split into sub-sentences, from which relations are extracted by a greedy pattern matching algorithm using automatically generated patterns.
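
The abstract does not spell out the greedy pattern matching step, so the following is only a minimal sketch of one plausible reading: scan a sub-sentence left to right and, at each position, accept the longest pattern that matches. The pattern list, the `PROTEIN` slot marker, and the function names `match_at` and `greedy_extract` are all hypothetical, and the patterns are hand-written here rather than automatically generated as in the paper. Shallow parsing and the handling of appositive and coordinative structures are not shown.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical hand-written patterns; "PROTEIN" marks a slot that must be
# filled by a token already tagged as a protein name.
PATTERNS: List[Tuple[str, ...]] = [
    ("PROTEIN", "interacts", "with", "PROTEIN"),
    ("PROTEIN", "binds", "to", "PROTEIN"),
    ("PROTEIN", "binds", "PROTEIN"),
]


def match_at(tokens: List[str], proteins: Set[str], start: int,
             pattern: Tuple[str, ...]) -> Optional[List[str]]:
    """Return the proteins filling the slots if pattern matches at start."""
    if start + len(pattern) > len(tokens):
        return None
    filled = []
    for offset, symbol in enumerate(pattern):
        token = tokens[start + offset]
        if symbol == "PROTEIN":
            if token not in proteins:
                return None
            filled.append(token)
        elif token.lower() != symbol:
            return None
    return filled


def greedy_extract(tokens: List[str],
                   proteins: Set[str]) -> List[Tuple[str, str, str]]:
    """Scan left to right, taking the longest matching pattern each time."""
    ordered = sorted(PATTERNS, key=len, reverse=True)
    relations = []
    i = 0
    while i < len(tokens):
        for pattern in ordered:
            filled = match_at(tokens, proteins, i, pattern)
            if filled is not None:
                relations.append((filled[0], pattern[1], filled[1]))
                # Resume at the last matched token so the second protein
                # can also open the next match.
                i += len(pattern) - 1
                break
        else:
            i += 1  # no pattern matched here; move on one token
    return relations
```

For example, `greedy_extract("RAD51 interacts with BRCA2".split(), {"RAD51", "BRCA2"})` yields a single `(protein, verb, protein)` triple. Trying longer patterns first is what makes the sketch greedy: `binds to` is preferred over the shorter `binds` pattern when both could apply.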

RESULTS

We evaluated our approach on extracting protein-protein interactions from full biomedical texts; it achieved an average F-score of 80% on individual verbs and 66% on all verbs. With the help of shallow parsing analysis, pattern matching is improved remarkably. Compared with the traditional pattern matching algorithm, our approach improves both precision and F-score by about 7%. Its performance is comparable to that of the best of the other systems. A demo system is available at http://spies.cs.tsinghua.edu.cn.
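
For readers unfamiliar with the metric, the sketch below shows how precision, recall, and F-score are commonly computed from extraction counts. It assumes the balanced F-score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_score` and the counts in the usage note are hypothetical, not taken from the paper.

```python
from typing import Tuple


def f_score(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, balanced F-score) from raw counts.

    tp: extracted relations that are correct
    fp: extracted relations that are wrong
    fn: correct relations that the system missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f
```

With hypothetical counts of 8 correct extractions, 2 spurious ones, and 2 misses, `f_score(8, 2, 2)` gives precision and recall of 0.8 each and an F-score of approximately 0.8.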

Similar articles

1
A hybrid method for relation extraction from biomedical literature.
Int J Med Inform. 2006 Jun;75(6):443-55. doi: 10.1016/j.ijmedinf.2005.06.010. Epub 2005 Aug 10.
2
Recognizing names in biomedical texts: a machine learning approach.
Bioinformatics. 2004 May 1;20(7):1178-90. doi: 10.1093/bioinformatics/bth060. Epub 2004 Feb 10.
3
Recognizing names in biomedical texts using mutual information independence model and SVM plus sigmoid.
Int J Med Inform. 2006 Jun;75(6):456-67. doi: 10.1016/j.ijmedinf.2005.06.012. Epub 2005 Aug 19.
4
Discovering patterns to extract protein-protein interactions from full texts.
Bioinformatics. 2004 Dec 12;20(18):3604-12. doi: 10.1093/bioinformatics/bth451. Epub 2004 Jul 29.
5
Discovering patterns to extract protein-protein interactions from the literature: Part II.
Bioinformatics. 2005 Aug 1;21(15):3294-300. doi: 10.1093/bioinformatics/bti493. Epub 2005 May 12.
6
Distributed modules for text annotation and IE applied to the biomedical domain.
Int J Med Inform. 2006 Jun;75(6):496-500. doi: 10.1016/j.ijmedinf.2005.06.011. Epub 2005 Aug 8.
7
Zone analysis in biology articles as a basis for information extraction.
Int J Med Inform. 2006 Jun;75(6):468-87. doi: 10.1016/j.ijmedinf.2005.06.013. Epub 2005 Aug 19.
8
The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text.
J Biomed Inform. 2003 Dec;36(6):462-77. doi: 10.1016/j.jbi.2003.11.003.
9
Gene symbol disambiguation using knowledge-based profiles.
Bioinformatics. 2007 Apr 15;23(8):1015-22. doi: 10.1093/bioinformatics/btm056. Epub 2007 Feb 21.
10
Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions.
Int J Med Inform. 2006 Jun;75(6):430-42. doi: 10.1016/j.ijmedinf.2005.06.009. Epub 2005 Aug 11.

Cited by

1
A Relation Extraction Framework for Biomedical Text Using Hybrid Feature Set.
Comput Math Methods Med. 2015;2015:910423. doi: 10.1155/2015/910423. Epub 2015 Aug 10.
2
iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system.
Database (Oxford). 2014 May 21;2014. doi: 10.1093/database/bau038. Print 2014.
3
Knowledge-based extraction of adverse drug events from biomedical text.
BMC Bioinformatics. 2014 Mar 4;15:64. doi: 10.1186/1471-2105-15-64.
4
A linguistic rule-based approach to extract drug-drug interactions from pharmacological documents.
BMC Bioinformatics. 2011 Mar 29;12 Suppl 2(Suppl 2):S1. doi: 10.1186/1471-2105-12-S2-S1.
5
Extracting causal relations on HIV drug resistance from literature.
BMC Bioinformatics. 2010 Feb 23;11:101. doi: 10.1186/1471-2105-11-101.
6
Large-scale directional relationship extraction and resolution.
BMC Bioinformatics. 2008 Aug 12;9 Suppl 9(Suppl 9):S11. doi: 10.1186/1471-2105-9-S9-S11.