A categorical analysis of coreference resolution errors in biomedical texts.

Authors

Choi Miji, Zobel Justin, Verspoor Karin

Affiliations

Department of Computing and Information Systems, The University of Melbourne, Melbourne, Australia; National ICT Australia (NICTA), Victoria Research Laboratory, Australia.

Department of Computing and Information Systems, The University of Melbourne, Melbourne, Australia.

Publication information

J Biomed Inform. 2016 Apr;60:309-18. doi: 10.1016/j.jbi.2016.02.015. Epub 2016 Feb 27.

DOI: 10.1016/j.jbi.2016.02.015
PMID: 26925515

Abstract

BACKGROUND

Coreference resolution is an essential task in information extraction from the published biomedical literature. It supports the discovery of complex information by linking referring expressions such as pronouns and appositives to their referents, which are typically entities that play a central role in biomedical events. Correctly establishing these links allows detailed understanding of all the participants in events, and connecting events together through their shared participants.

RESULTS

As an initial step towards the development of a novel coreference resolution system for the biomedical domain, we have categorised the characteristics of coreference relations by type of anaphor as well as broader syntactic and semantic characteristics, and have compared the performance of a domain adaptation of a state-of-the-art general system to published results from domain-specific systems in terms of this categorisation. We also develop a rule-based system for anaphoric coreference resolution in the biomedical domain with simple modules derived from available systems. Our results show that the domain-specific systems outperform the general system overall. Whilst this result is unsurprising, our proposed categorisation enables a detailed quantitative analysis of the system performance. We identify limitations of each system and find that there remain important gaps in the state-of-the-art systems, which are clearly identifiable with respect to the categorisation.

CONCLUSION

We have analysed in detail the performance of existing coreference resolution systems for the biomedical literature and have demonstrated that there are clear gaps in their coverage. The approach developed in the general domain needs to be tailored for portability to the biomedical domain. The specific framework for class-based error analysis of existing systems that we propose has benefits for identifying specific limitations of those systems. This in turn provides insights for further system development.
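
As a rough illustration of what a class-based error analysis can look like, the sketch below scores predicted anaphor-antecedent links separately for each anaphor category. It is a minimal example under stated assumptions, not the authors' framework: the category names, mention identifiers, link representation, and the function `scores_by_category` are all hypothetical, and the abstract does not specify the actual categories or the scoring metric used in the paper.

```python
from collections import Counter


def scores_by_category(gold_links, pred_links, category_of):
    """Per-category precision, recall and F1 over (anaphor, antecedent) links.

    gold_links, pred_links: sets of (anaphor_id, antecedent_id) tuples.
    category_of: dict mapping each anaphor_id to a category label
    (hypothetical labels, chosen only for this illustration).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for link in pred_links:
        cat = category_of[link[0]]
        if link in gold_links:
            tp[cat] += 1  # predicted link matches a gold link
        else:
            fp[cat] += 1  # predicted link not in the gold standard
    for link in gold_links - pred_links:
        fn[category_of[link[0]]] += 1  # gold link the system missed

    report = {}
    for cat in sorted(set(tp) | set(fp) | set(fn)):
        predicted = tp[cat] + fp[cat]
        expected = tp[cat] + fn[cat]
        p = tp[cat] / predicted if predicted else 0.0
        r = tp[cat] / expected if expected else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        report[cat] = {"precision": p, "recall": r, "f1": f1}
    return report


# Toy data, invented for illustration only.
category_of = {
    "it_1": "pronoun",
    "they_2": "pronoun",
    "this_protein_3": "definite_np",
    "which_4": "relative_pronoun",
}
gold = {
    ("it_1", "p53"),
    ("they_2", "kinases"),
    ("this_protein_3", "p53"),
    ("which_4", "IL-2"),
}
pred = {
    ("it_1", "p53"),
    ("they_2", "p53"),  # wrong antecedent
    ("which_4", "IL-2"),
}

report = scores_by_category(gold, pred, category_of)
for cat, s in report.items():
    print(f"{cat}: P={s['precision']:.2f} R={s['recall']:.2f} F1={s['f1']:.2f}")
```

Breaking scores down this way makes it visible when a system handles one class of anaphor well and another not at all, a difference that a single overall score would hide.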

Similar articles

1. A categorical analysis of coreference resolution errors in biomedical texts.
J Biomed Inform. 2016 Apr;60:309-18. doi: 10.1016/j.jbi.2016.02.015. Epub 2016 Feb 27.
2. Bio-SCoRes: A Smorgasbord Architecture for Coreference Resolution in Biomedical Text.
PLoS One. 2016 Mar 2;11(3):e0148538. doi: 10.1371/journal.pone.0148538. eCollection 2016.
3. Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.
BMC Bioinformatics. 2017 Aug 17;18(1):372. doi: 10.1186/s12859-017-1775-9.
4. Minimalistic Approach to Coreference Resolution in Lithuanian Medical Records.
Comput Math Methods Med. 2019 Mar 20;2019:9079840. doi: 10.1155/2019/9079840. eCollection 2019.
5. Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives.
J Am Med Inform Assoc. 2013 Mar-Apr;20(2):356-62. doi: 10.1136/amiajnl-2011-000767. Epub 2012 Jul 10.
6. MCORES: a system for noun phrase coreference resolution for clinical records.
J Am Med Inform Assoc. 2012 Sep-Oct;19(5):906-12. doi: 10.1136/amiajnl-2011-000591. Epub 2012 Mar 14.
7. Lexical patterns, features and knowledge resources for coreference resolution in clinical notes.
J Biomed Inform. 2012 Oct;45(5):901-12. doi: 10.1016/j.jbi.2012.02.012. Epub 2012 Mar 17.
8. A classification approach to coreference in discharge summaries: 2011 i2b2 challenge.
J Am Med Inform Assoc. 2012 Sep-Oct;19(5):897-905. doi: 10.1136/amiajnl-2011-000734. Epub 2012 Apr 13.
9. Coreference analysis in clinical notes: a multi-pass sieve with alternate anaphora resolution modules.
J Am Med Inform Assoc. 2012 Sep-Oct;19(5):867-74. doi: 10.1136/amiajnl-2011-000766. Epub 2012 Jun 16.
10. EUSKOR: End-to-end coreference resolution system for Basque.
PLoS One. 2019 Sep 12;14(9):e0221801. doi: 10.1371/journal.pone.0221801. eCollection 2019.

Cited by

1. Distinguished representation of identical mentions in bio-entity coreference resolution.
BMC Med Inform Decis Mak. 2022 Apr 30;22(1):116. doi: 10.1186/s12911-022-01862-1.
2. A set of domain rules and a deep network for protein coreference resolution.
Database (Oxford). 2018 Jan 1;2018. doi: 10.1093/database/bay065.
3. Coreference resolution improves extraction of Biological Expression Language statements from texts.
Database (Oxford). 2016 Jul 3;2016. doi: 10.1093/database/baw076. Print 2016.
4. A crowdsourcing workflow for extracting chemical-induced disease relations from free text.
Database (Oxford). 2016 Apr 17;2016. doi: 10.1093/database/baw051. Print 2016.