Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives.

Affiliation

Department of Computer Science, UIUC, Urbana, IL 61801, USA.

Publication information

J Am Med Inform Assoc. 2013 Mar-Apr;20(2):356-62. doi: 10.1136/amiajnl-2011-000767. Epub 2012 Jul 10.

DOI:10.1136/amiajnl-2011-000767

PMID:22781192

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3638172/

Abstract

OBJECTIVE

This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims to cluster all mentions in a single document into coherent entities.

MATERIALS AND METHODS

A knowledge-intensive approach to coreference resolution is employed. The domain knowledge used includes several domain-specific lists, knowledge-intensive mention parsing, and a task-informed discourse model. Mention parsing allows us to abstract over the surface form of a mention and represent each mention using a higher-level representation, which we call the mention's semantic representation (SR). SR reduces the mention to a standard form and hence provides better support for comparing and matching. Existing coreference resolution systems tend to ignore discourse aspects and rely heavily on lexical and structural cues in the text. The authors break from this tradition and present a discourse model for "person" type mentions in clinical narratives, which greatly simplifies coreference resolution.
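To make the idea of reducing a mention to a standard form concrete, here is a minimal sketch in Python. It is not the authors' mention parser: the abbreviation list, the determiner list, and the function names `normalize` and `same_standard_form` are all hypothetical and chosen only for illustration. It shows how two different surface forms can be compared once both are mapped to the same normalized string.

```python
import re

# Hypothetical domain list: illustrative entries only, not the paper's actual lists.
ABBREVIATIONS = {
    "cxr": "chest x-ray",
    "mi": "myocardial infarction",
}

# Hypothetical set of leading words that carry no entity identity.
DETERMINERS = {"a", "an", "the", "his", "her", "this", "that"}


def normalize(mention: str) -> str:
    """Map a surface mention to a simple standard form (an assumed stand-in for SR)."""
    tokens = re.findall(r"[a-z0-9\-]+", mention.lower())
    kept = [t for t in tokens if t not in DETERMINERS]
    expanded = [ABBREVIATIONS.get(t, t) for t in kept]
    return " ".join(expanded)


def same_standard_form(first: str, second: str) -> bool:
    """Compare two mentions by their normalized form rather than their surface form."""
    return normalize(first) == normalize(second)


if __name__ == "__main__":
    print(normalize("The CXR"))
    print(same_standard_form("The CXR", "a chest x-ray"))
```

A real semantic representation would presumably carry more structure than a string (for example, separate fields for the head concept and its modifiers); a flat string is used here only to keep the comparison step visible.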

RESULTS

The system was evaluated on four different datasets that were made available in the 2011 i2b2/VA coreference challenge. The unweighted average of F1 scores (over B-cubed, MUC and CEAF) varied from 84.2% to 88.1%. These experiments show that domain knowledge is effective for different mention types across all the datasets.
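As an illustration of how such a score is assembled, the sketch below computes B-cubed precision, recall and F1 for one document and then takes an unweighted mean of several F1 values. It assumes the key (gold) and response (system) clusterings cover the same mentions, and it omits MUC and CEAF; the function names `b_cubed` and `unweighted_average` are hypothetical, and this is not the challenge's official scorer.

```python
from typing import Iterable, List, Set, Tuple


def b_cubed(key: List[Set[str]], response: List[Set[str]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under B-cubed for one document.

    Assumes every mention appears in exactly one key cluster and one response cluster.
    """
    key_of = {m: frozenset(c) for c in key for m in c}
    resp_of = {m: frozenset(c) for c in response for m in c}
    mentions = key_of.keys() & resp_of.keys()
    if not mentions:
        return 0.0, 0.0, 0.0
    # Per-mention overlap between its gold cluster and its system cluster.
    precision = sum(len(key_of[m] & resp_of[m]) / len(resp_of[m]) for m in mentions) / len(mentions)
    recall = sum(len(key_of[m] & resp_of[m]) / len(key_of[m]) for m in mentions) / len(mentions)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def unweighted_average(f1_scores: Iterable[float]) -> float:
    """Plain arithmetic mean, e.g. over the B-cubed, MUC and CEAF F1 scores."""
    scores = list(f1_scores)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = [{"a", "b", "c"}, {"d"}]
    system = [{"a", "b"}, {"c", "d"}]
    print(b_cubed(gold, system))
```

In the toy example, the system splits one gold entity and merges part of it with another, so both precision and recall fall below 1.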

DISCUSSION

Error analysis shows that most of the system's recall errors could be addressed by adding further domain knowledge. The precision errors, on the other hand, are more subtle and indicate that building a robust coreference system requires understanding the relations in which mentions participate.

CONCLUSION

This paper presents an approach that makes extensive use of domain knowledge to significantly improve coreference resolution. The authors state that their system and the knowledge sources developed will be made publicly available.

Similar articles

A supervised framework for resolving coreference in clinical records.

J Am Med Inform Assoc. 2012 Sep-Oct;19(5):875-82. doi: 10.1136/amiajnl-2012-000810. Epub 2012 May 19.

A classification approach to coreference in discharge summaries: 2011 i2b2 challenge.

J Am Med Inform Assoc. 2012 Sep-Oct;19(5):897-905. doi: 10.1136/amiajnl-2011-000734. Epub 2012 Apr 13.

Bio-SCoRes: A Smorgasbord Architecture for Coreference Resolution in Biomedical Text.

PLoS One. 2016 Mar 2;11(3):e0148538. doi: 10.1371/journal.pone.0148538. eCollection 2016.

A categorical analysis of coreference resolution errors in biomedical texts.

J Biomed Inform. 2016 Apr;60:309-18. doi: 10.1016/j.jbi.2016.02.015. Epub 2016 Feb 27.

Towards generalizable entity-centric clinical coreference resolution.

J Biomed Inform. 2017 May;69:251-258. doi: 10.1016/j.jbi.2017.04.015. Epub 2017 Apr 21.

Distinguished representation of identical mentions in bio-entity coreference resolution.

BMC Med Inform Decis Mak. 2022 Apr 30;22(1):116. doi: 10.1186/s12911-022-01862-1.

Evaluating the state of the art in coreference resolution for electronic medical records.

J Am Med Inform Assoc. 2012 Sep-Oct;19(5):786-91. doi: 10.1136/amiajnl-2011-000784. Epub 2012 Feb 24.

MCORES: a system for noun phrase coreference resolution for clinical records.

J Am Med Inform Assoc. 2012 Sep-Oct;19(5):906-12. doi: 10.1136/amiajnl-2011-000591. Epub 2012 Mar 14.

Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.

BMC Bioinformatics. 2017 Aug 17;18(1):372. doi: 10.1186/s12859-017-1775-9.

Cited by

Collecting specialty-related medical terms: Development and evaluation of a resource for Spanish.

BMC Med Inform Decis Mak. 2021 May 4;21(1):145. doi: 10.1186/s12911-021-01495-w.

Document Sublanguage Clustering to Detect Medical Specialty in Cross-institutional Clinical Texts.

Proc ACM Int Workshop Data Text Min Biomed Inform. 2013 Oct-Nov;2013:9-12. doi: 10.1145/2512089.2512101.

Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis.

Yearb Med Inform. 2015 Aug 13;10(1):183-93. doi: 10.15265/IY-2015-009.

Mission and Sustainability of Informatics for Integrating Biology and the Bedside (i2b2).

EGEMS (Wash DC). 2014 Sep 11;2(2):1074. doi: 10.13063/2327-9214.1074. eCollection 2014.

"Big data" and the electronic health record.

Yearb Med Inform. 2014 Aug 15;9(1):97-104. doi: 10.15265/IY-2014-0003.
