Large scale application of neural network based semantic role labeling for automated relation extraction from biomedical texts.

Authors

Barnickel Thorsten, Weston Jason, Collobert Ronan, Mewes Hans-Werner, Stümpflen Volker

Affiliation

Helmholtz Zentrum München, Institute of Bioinformatics and Systems Biology (MIPS), Neuherberg, Germany.

Publication

PLoS One. 2009 Jul 28;4(7):e6393. doi: 10.1371/journal.pone.0006393.

DOI:10.1371/journal.pone.0006393
PMID:19636432
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2712690/
Abstract

To reduce the increasing amount of time spent on literature search in the life sciences, several methods for automated knowledge extraction have been developed. Co-occurrence based approaches can deal with large text corpora like MEDLINE in an acceptable time but are not able to extract any specific type of semantic relation. Semantic relation extraction methods based on syntax trees, on the other hand, are computationally expensive and the interpretation of the generated trees is difficult. Several natural language processing (NLP) approaches for the biomedical domain exist focusing specifically on the detection of a limited set of relation types. For systems biology, generic approaches for the detection of a multitude of relation types which in addition are able to process large text corpora are needed but the number of systems meeting both requirements is very limited. We introduce the use of SENNA ("Semantic Extraction using a Neural Network Architecture"), a fast and accurate neural network based Semantic Role Labeling (SRL) program, for the large scale extraction of semantic relations from the biomedical literature. A comparison of processing times of SENNA and other SRL systems or syntactical parsers used in the biomedical domain revealed that SENNA is the fastest Proposition Bank (PropBank) conforming SRL program currently available. 89 million biomedical sentences were tagged with SENNA on a 100 node cluster within three days. The accuracy of the presented relation extraction approach was evaluated on two test sets of annotated sentences resulting in precision/recall values of 0.71/0.43. We show that the accuracy as well as processing speed of the proposed semantic relation extraction approach is sufficient for its large scale application on biomedical text. The proposed approach is highly generalizable regarding the supported relation types and appears to be especially suited for general-purpose, broad-scale text mining systems. 
The presented approach bridges the gap between fast, co-occurrence-based approaches lacking semantic relations and highly specialized and computationally demanding NLP approaches.
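
As a rough sanity check on the figures quoted in the abstract, the sketch below derives two numbers the abstract itself does not state: the F1 score implied by the reported precision/recall of 0.71/0.43, and the average per-node throughput implied by tagging 89 million sentences on a 100 node cluster in three days. The function names are illustrative only. The throughput calculation assumes all 100 nodes ran for the full three days, so it should be read as a lower bound, since the abstract says "within three days".

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def per_node_throughput(sentences: int, nodes: int, days: float) -> float:
    """Average sentences per second per node.

    Assumption: every node was busy for the whole period.
    """
    seconds = days * 24 * 60 * 60
    return sentences / (nodes * seconds)


# Values taken from the abstract.
f1 = f1_score(0.71, 0.43)
rate = per_node_throughput(89_000_000, 100, 3)

print(f"F1 = {f1:.2f}")                      # F1 = 0.54
print(f"sentences/s/node >= {rate:.1f}")     # sentences/s/node >= 3.4
```

The F1 value is a derived summary, not a result reported by the authors; it simply makes the trade-off between the moderately high precision and the lower recall easier to compare against other systems.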

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a592/2712690/b3bfc798f687/pone.0006393.g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a592/2712690/953d1489beaa/pone.0006393.g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a592/2712690/925113fb980d/pone.0006393.g003.jpg

Similar articles

1
Large scale application of neural network based semantic role labeling for automated relation extraction from biomedical texts.
PLoS One. 2009 Jul 28;4(7):e6393. doi: 10.1371/journal.pone.0006393.
2
BIOSMILE: a semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features.
BMC Bioinformatics. 2007 Sep 1;8:325. doi: 10.1186/1471-2105-8-325.
3
Enhancing biomedical text summarization using semantic relation extraction.
PLoS One. 2011;6(8):e23862. doi: 10.1371/journal.pone.0023862. Epub 2011 Aug 26.
4
Clinical Context-Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation.
J Med Internet Res. 2020 Oct 23;22(10):e19810. doi: 10.2196/19810.
5
Extraction of semantic biomedical relations from text using conditional random fields.
BMC Bioinformatics. 2008 Apr 23;9:207. doi: 10.1186/1471-2105-9-207.
6
Biomedical question answering using semantic relations.
BMC Bioinformatics. 2015 Jan 16;16(1):6. doi: 10.1186/s12859-014-0365-3.
7
Domain adaptation for semantic role labeling of clinical text.
J Am Med Inform Assoc. 2015 Sep;22(5):967-79. doi: 10.1093/jamia/ocu048. Epub 2015 Jun 10.
8
Semantic biomedical resource discovery: a Natural Language Processing framework.
BMC Med Inform Decis Mak. 2015 Sep 30;15:77. doi: 10.1186/s12911-015-0200-4.
9
Bio-semantic relation extraction with attention-based external knowledge reinforcement.
BMC Bioinformatics. 2020 May 24;21(1):213. doi: 10.1186/s12859-020-3540-8.
10
Domain adaptation for semantic role labeling in the biomedical domain.
Bioinformatics. 2010 Apr 15;26(8):1098-104. doi: 10.1093/bioinformatics/btq075. Epub 2010 Feb 23.

Cited by

1
A Neural Network Reveals Motoric Effects of Maternal Preconception Exposure to Nicotine on Rat Pup Behavior: A New Approach for Movement Disorders Diagnosis.
Front Neurosci. 2021 Jul 20;15:686767. doi: 10.3389/fnins.2021.686767. eCollection 2021.
2
KGen: a knowledge graph generator from biomedical scientific literature.
BMC Med Inform Decis Mak. 2020 Dec 14;20(Suppl 4):314. doi: 10.1186/s12911-020-01341-5.
3
Exploiting sequence labeling framework to extract document-level relations from biomedical texts.
BMC Bioinformatics. 2020 Mar 27;21(1):125. doi: 10.1186/s12859-020-3457-2.
4
Will the future of knowledge work automation transform personalized medicine?
Appl Transl Genom. 2014 Jun 10;3(3):50-3. doi: 10.1016/j.atg.2014.05.003. eCollection 2014 Sep 1.
5
Stroma-Derived Connective Tissue Growth Factor Maintains Cell Cycle Progression and Repopulation Activity of Hematopoietic Stem Cells In Vitro.
Stem Cell Reports. 2015 Nov 10;5(5):702-715. doi: 10.1016/j.stemcr.2015.09.018. Epub 2015 Oct 29.
6
Extracting relations from traditional Chinese medicine literature via heterogeneous entity networks.
J Am Med Inform Assoc. 2016 Mar;23(2):356-65. doi: 10.1093/jamia/ocv092. Epub 2015 Jul 29.
7
Domain adaptation for semantic role labeling of clinical text.
J Am Med Inform Assoc. 2015 Sep;22(5):967-79. doi: 10.1093/jamia/ocu048. Epub 2015 Jun 10.
8
Automating case definitions using literature-based reasoning.
Appl Clin Inform. 2013 Oct 30;4(4):515-27. doi: 10.4338/ACI-2013-04-RA-0028. eCollection 2013.
9
Negatome 2.0: a database of non-interacting proteins derived by literature mining, manual annotation and protein structure analysis.
Nucleic Acids Res. 2014 Jan;42(Database issue):D396-400. doi: 10.1093/nar/gkt1079. Epub 2013 Nov 8.
10
Recurrent temporal networks and language acquisition-from corticostriatal neurophysiology to reservoir computing.
Front Psychol. 2013 Aug 5;4:500. doi: 10.3389/fpsyg.2013.00500. eCollection 2013.

References

1
Semantic role labeling for protein transport predicates.
BMC Bioinformatics. 2008 Jun 11;9:277. doi: 10.1186/1471-2105-9-277.
2
Extraction of protein interaction data: a comparative analysis of methods in use.
EURASIP J Bioinform Syst Biol. 2007;2007(1):53096. doi: 10.1155/2007/53096.
3
BIOSMILE: a semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features.
BMC Bioinformatics. 2007 Sep 1;8:325. doi: 10.1186/1471-2105-8-325.
4
Benchmarking natural-language parsers for biological applications using dependency graphs.
BMC Bioinformatics. 2007 Jan 25;8:24. doi: 10.1186/1471-2105-8-24.
5
EBIMed--text crunching to gather facts for proteins from Medline.
Bioinformatics. 2007 Jan 15;23(2):e237-44. doi: 10.1093/bioinformatics/btl302.
6
RelEx--relation extraction using dependency parse trees.
Bioinformatics. 2007 Feb 1;23(3):365-71. doi: 10.1093/bioinformatics/btl616. Epub 2006 Dec 1.
7
Towards semantic role labeling & IE in the medical literature.
AMIA Annu Symp Proc. 2005;2005:410-4.
8
Overview of BioCreAtIvE: critical assessment of information extraction for biology.
BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S1. doi: 10.1186/1471-2105-6-S1-S1. Epub 2005 May 24.
9
A gene network for navigating the literature.
Nat Genet. 2004 Jul;36(7):664. doi: 10.1038/ng0704-664.
10
PreBIND and Textomy--mining the biomedical literature for protein-protein interactions using a support vector machine.
BMC Bioinformatics. 2003 Mar 27;4:11. doi: 10.1186/1471-2105-4-11.