
Combining joint models for biomedical event extraction.

Affiliation

Department of Computer Science, Stanford University, Stanford, CA, USA.

Publication information

BMC Bioinformatics. 2012 Jun 26;13 Suppl 11(Suppl 11):S9. doi: 10.1186/1471-2105-13-S11-S9.

DOI:10.1186/1471-2105-13-S11-S9
PMID:22759463
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3395172/
Abstract

BACKGROUND

We explore techniques for performing model combination between the UMass and Stanford biomedical event extraction systems. Both sub-components address event extraction as a structured prediction problem, and use dual decomposition (UMass) and parsing algorithms (Stanford) to find the best-scoring event structure. Our primary focus is on stacking, where the predictions from the Stanford system are used as features in the UMass system. For comparison, we look at simpler model combination techniques, such as intersection and union, which require only the outputs from each system and combine them directly.
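The three combination strategies named above can be contrasted in a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the `Event` tuple layout, the helper names, and the feature names are all hypothetical, and the real systems score full event structures rather than flat sets.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical event representation (an assumption for this sketch):
# (event_type, trigger_span, frozenset of (role, argument_id) pairs)
Event = Tuple[str, Tuple[int, int], FrozenSet[Tuple[str, str]]]


def make_event(event_type: str, trigger: Tuple[int, int],
               args: Iterable[Tuple[str, str]]) -> Event:
    """Build a hashable event so that events can be stored in sets."""
    return (event_type, trigger, frozenset(args))


def combine_intersection(a: Set[Event], b: Set[Event]) -> Set[Event]:
    """Keep only the events that both systems predicted."""
    return a & b


def combine_union(a: Set[Event], b: Set[Event]) -> Set[Event]:
    """Keep every event that at least one system predicted."""
    return a | b


def stacking_features(candidate: Event,
                      base_predictions: Set[Event]) -> Dict[str, float]:
    """Turn the base system's output into indicator features that a
    second-stage (stacked) model could weigh alongside its own features."""
    event_type, trigger, _ = candidate
    in_base = float(candidate in base_predictions)
    same_trigger = any(t == trigger for _, t, _ in base_predictions)
    return {
        "base_predicts_event": in_base,
        "base_predicts_trigger": float(same_trigger),
        f"base_predicts_event&type={event_type}": in_base,
    }


# Toy predictions, invented for illustration only.
e1 = make_event("Phosphorylation", (10, 25), [("Theme", "T1")])
e2 = make_event("Gene_expression", (40, 50), [("Theme", "T2")])
e3 = make_event("Binding", (60, 67), [("Theme", "T3"), ("Theme", "T4")])

umass = {e1, e2}
stanford = {e2, e3}

both = combine_intersection(umass, stanford)   # only e2
either = combine_union(umass, stanford)        # e1, e2 and e3
features = stacking_features(e2, stanford)     # fed to the stacked model
```

Intersection and union operate only on the final outputs, whereas stacking lets the second-stage model learn how much to trust each base prediction, which is the distinction the abstract draws.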

RESULTS

First, we find that stacking substantially improves performance while intersection and union provide no significant benefits. Second, we investigate the graph properties of event structures and their impact on the combination of our systems. Finally, we trace the origins of events proposed by the stacked model to determine the role each system plays in different components of the output. We learn that, while stacking can propose novel event structures not seen in either base model, these events have extremely low precision. Removing these novel events improves our already state-of-the-art F1 to 56.6% on the test set of Genia (Task 1). Overall, the combined system formed via stacking ("FAUST") performed well in the BioNLP 2011 shared task. The FAUST system obtained 1st place in three out of four tasks: 1st place in Genia Task 1 (56.0% F1) and Task 2 (53.9%), 2nd place in the Epigenetics and Post-translational Modifications track (35.0%), and 1st place in the Infectious Diseases track (55.6%).
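The origin tracing and novel-event filtering described above can be sketched as follows. This is a minimal illustration with invented toy data, not the paper's evaluation code: the function names and the string event identifiers are hypothetical, and the official BioNLP evaluation uses more permissive matching criteria than exact set membership.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str],
                        gold: Set[str]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over sets of event identifiers."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def trace_origin(event: str, umass: Set[str], stanford: Set[str]) -> str:
    """Label a stacked-model event by which base system(s) also proposed it."""
    in_umass, in_stanford = event in umass, event in stanford
    if in_umass and in_stanford:
        return "both"
    if in_umass:
        return "umass"
    if in_stanford:
        return "stanford"
    return "novel"


def drop_novel(stacked: Set[str], umass: Set[str],
               stanford: Set[str]) -> Set[str]:
    """Remove events that neither base system proposed."""
    return {e for e in stacked if trace_origin(e, umass, stanford) != "novel"}


# Toy data, invented for illustration; these are not results from the paper.
gold = {"e1", "e2", "e3", "e4"}
umass = {"e1", "e2"}
stanford = {"e2", "e3"}
stacked = {"e1", "e2", "e3", "x1", "x2"}  # x1 and x2 are novel and incorrect

before = precision_recall_f1(stacked, gold)
filtered = drop_novel(stacked, umass, stanford)
after = precision_recall_f1(filtered, gold)
```

In this toy case the novel events are all wrong, so dropping them raises precision without lowering recall. The gain the abstract reports depends on novel events having very low precision in practice.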

CONCLUSION

We present a state-of-the-art event extraction system that relies on the strengths of structured prediction and model combination through stacking. Akin to results on other tasks, stacking outperforms intersection and union and leads to very strong results. The utility of model combination hinges on complementary views of the data, and we show that our sub-systems capture different graph properties of event structures. Finally, we show that removing low-precision novel events further improves the performance of stacking.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb47/3395172/65a3d22d7683/1471-2105-13-S11-S9-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb47/3395172/82c37bbd6abe/1471-2105-13-S11-S9-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb47/3395172/346082dafe91/1471-2105-13-S11-S9-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb47/3395172/c2eed6e2bfd4/1471-2105-13-S11-S9-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb47/3395172/e143fccf1854/1471-2105-13-S11-S9-5.jpg
