Event trigger identification for biomedical events extraction using domain knowledge.

Affiliations

School of Computer Science and Engineering, Key Laboratory of Computer Network and Information Integration, Ministry of Education, Southeast University, Nanjing 210096, China, and School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK.

Publication information

Bioinformatics. 2014 Jun 1;30(11):1587-94. doi: 10.1093/bioinformatics/btu061. Epub 2014 Jan 30.

Abstract

MOTIVATION

In molecular biology, molecular events describe observable alterations of biomolecules, such as the binding of proteins or RNA production. These events might be responsible for drug reactions or the development of certain diseases. As such, biomedical event extraction, the process of automatically detecting descriptions of molecular interactions in research articles, has recently attracted substantial research interest. Event trigger identification, which detects the words describing event types, is a crucial prerequisite step in the biomedical event extraction pipeline. Taking the event types as classes, event trigger identification can be viewed as a classification task: for each word in a sentence, a trained classifier predicts, based on context features, whether the word corresponds to an event type and, if so, which one. Therefore, a well-designed feature set with a good level of discrimination and generalization is crucial for the performance of event trigger identification.
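The per-word classification view can be illustrated with a minimal sketch of context feature extraction. This is not the authors' feature set: the function name `context_features`, the window size, the feature naming scheme, and the example sentence are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def context_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, float]:
    """Build sparse features for tokens[i] from the word and its neighbours.

    Hypothetical feature scheme for illustration only; the abstract does
    not specify the actual context features.
    """
    feats: Dict[str, float] = {f"word={tokens[i].lower()}": 1.0}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        # Positions outside the sentence are replaced by a padding symbol.
        neighbour = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
        feats[f"ctx[{offset}]={neighbour}"] = 1.0
    return feats


# Illustrative sentence (an assumption, not taken from the corpus).
sentence = "IL-2 induces phosphorylation of STAT5".split()

# One feature dictionary per word: each would be passed to a trained
# classifier that outputs an event type or "not a trigger".
per_token = [context_features(sentence, i) for i in range(len(sentence))]
```

Each entry of `per_token` is the input for one classification decision, so a five-word sentence yields five predictions.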

RESULTS

In this article, we propose a novel framework for event trigger identification. In particular, we learn biomedical domain knowledge from a large text corpus built from Medline and embed it into word features using neural language modeling. The embedded features are then combined with the syntactic and semantic context features using the multiple kernel learning method. The combined feature set is used to train the event trigger classifier. Experimental results on the gold standard corpus show that the proposed framework improves the F-score by >2.5% over the state-of-the-art approach, demonstrating its effectiveness.
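The idea of combining two feature views through kernels can be sketched as a weighted sum of one kernel per view. This is a simplification under stated assumptions: the choice of a linear kernel for context features and an RBF kernel for embeddings, the function names, and the fixed weights `beta` are all hypothetical. In multiple kernel learning proper, the weights are learned from training data rather than fixed by hand.

```python
import math
from typing import Dict, Sequence, Tuple


def linear_kernel(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Dot product of two sparse context feature dictionaries."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel on two dense embedding vectors."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(
    ctx_a: Dict[str, float],
    ctx_b: Dict[str, float],
    emb_a: Sequence[float],
    emb_b: Sequence[float],
    beta: Tuple[float, float] = (0.5, 0.5),
) -> float:
    """Non-negative weighted sum of the two base kernels.

    beta is fixed here for illustration (an assumption); a multiple
    kernel learning method would optimise these weights instead.
    """
    return beta[0] * linear_kernel(ctx_a, ctx_b) + beta[1] * rbf_kernel(emb_a, emb_b)


# Toy inputs (hypothetical values, not real embeddings or corpus features).
ctx = {"word=binds": 1.0, "ctx[-1]=protein": 1.0}
emb = [0.1, -0.3, 0.7]
same = combined_kernel(ctx, ctx, emb, emb)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the combined function can be supplied to any kernel-based classifier, such as a support vector machine.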
