

A system for coreference resolution for the clinical narrative.

Author information

Children's Hospital Boston and Harvard Medical School, Boston, Massachusetts 02114, USA.

Publication information

J Am Med Inform Assoc. 2012 Jul-Aug;19(4):660-7. doi: 10.1136/amiajnl-2011-000599. Epub 2012 Jan 31.

Abstract

OBJECTIVE

To research computational methods for coreference resolution in the clinical narrative and build a system implementing the best methods.

METHODS

The Ontology Development and Information Extraction corpus annotated for coreference relations consists of 7214 coreferential markables, forming 5992 pairs and 1304 chains. We trained classifiers with semantic, syntactic, and surface features pruned by feature selection. For the three system components--for the resolution of relative pronouns, personal pronouns, and noun phrases--we experimented with support vector machines with linear and radial basis function (RBF) kernels, decision trees, and perceptrons. Evaluation of algorithms and varied feature sets was performed using standard metrics.
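The abstract reports markables, pairs, and chains but does not say how pairwise links are merged into chains. A common approach, shown here only as a minimal sketch under that assumption, is to treat each accepted (antecedent, anaphor) pair as a link and take the transitive closure with a union-find structure. The function name `chains_from_pairs` and the markable identifiers are hypothetical.

```python
from collections import defaultdict


def chains_from_pairs(pairs):
    """Group markables into chains given (antecedent, anaphor) links."""
    parent = {}

    def find(x):
        # Return the representative of x's chain, registering x if unseen.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for antecedent, anaphor in pairs:
        root_a, root_b = find(antecedent), find(anaphor)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two chains

    groups = defaultdict(set)
    for markable in parent:
        groups[find(markable)].add(markable)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical markable identifiers, not taken from the corpus.
links = [("m1", "m2"), ("m2", "m5"), ("m3", "m4")]
chains = chains_from_pairs(links)
```

Here three links over five markables collapse into two chains, which illustrates why the pair count and the chain count of a corpus are reported separately.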

RESULTS

The best-performing combination is support vector machines with an RBF kernel and all features (MUC score=0.352, B³=0.690, CEAF=0.486, BLANC=0.596), which outperforms a traditional decision tree baseline.
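To make the gap between the MUC and B³ scores easier to interpret, the sketch below computes both on a toy example using their textbook definitions. It assumes the key (gold) and response (system) chains cover the same markables; the paper's exact scorer and its handling of unmatched markables are not described in the abstract, and the helper names and data are hypothetical.

```python
def muc_recall(key, response):
    """MUC recall: share of key-chain links preserved by the response."""
    chain_of = {m: i for i, chain in enumerate(response) for m in chain}
    num = den = 0
    for chain in key:
        # Partitions of this key chain induced by the response; a markable
        # missing from the response counts as its own partition.
        parts = {chain_of.get(m, ("single", m)) for m in chain}
        num += len(chain) - len(parts)
        den += len(chain) - 1
    return num / den if den else 0.0


def b_cubed(key, response):
    """B-cubed precision and recall, averaged over markables in the key."""
    key_of = {m: set(c) for c in key for m in c}
    resp_of = {m: set(c) for c in response for m in c}
    precision = recall = 0.0
    for m, key_chain in key_of.items():
        resp_chain = resp_of.get(m, {m})
        overlap = len(key_chain & resp_chain)
        precision += overlap / len(resp_chain)
        recall += overlap / len(key_chain)
    n = len(key_of)
    return precision / n, recall / n


# Toy data: the response wrongly moves "c" into the second chain.
key = [{"a", "b", "c"}, {"d", "e"}]
response = [{"a", "b"}, {"c", "d", "e"}]

muc_r = muc_recall(key, response)
muc_p = muc_recall(response, key)  # MUC precision swaps the two roles
b3_p, b3_r = b_cubed(key, response)
```

MUC counts links, so it ignores singletons and is sensitive to how chains are split, while B³ averages per markable. The two metrics can therefore diverge considerably on the same output.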

DISCUSSION

The application showed good performance similar to performance on general English text. The main error source was sentence distances exceeding a window of 10 sentences between markables. A possible solution to this problem is hinted at by the fact that coreferent markables sometimes occurred in predictable (although distant) note sections. Another system limitation is failure to fully utilize synonymy and ontological knowledge. Future work will investigate additional ways to incorporate syntactic features into the coreference problem.
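The window-related error can be illustrated with a small sketch of candidate-pair generation. It assumes, hypothetically, that a markable is paired only with earlier markables at most `window` sentences back; the `Markable` class, the function name, and the example note are illustrative and not taken from the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Markable:
    ident: str
    sentence: int  # zero-based sentence index within the note


def candidate_pairs(markables, window=10):
    """Pair each markable with earlier ones at most `window` sentences back."""
    ordered = sorted(markables, key=lambda m: m.sentence)
    pairs = []
    for j, anaphor in enumerate(ordered):
        for antecedent in ordered[:j]:
            if anaphor.sentence - antecedent.sentence <= window:
                pairs.append((antecedent.ident, anaphor.ident))
    return pairs


# Hypothetical note: the last markable is far from both earlier ones.
note = [
    Markable("aspirin", 2),
    Markable("the medication", 5),
    Markable("this drug", 40),
]
pairs = candidate_pairs(note)
```

With the default window, "this drug" is never offered to the classifier as a candidate for either earlier markable, so no classifier decision can recover that link. Widening the window restores the candidates at the cost of many more negative pairs.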

CONCLUSION

We investigated computational methods for coreference resolution in the clinical narrative. The best methods are released as modules of the open source Clinical Text Analysis and Knowledge Extraction System and Ontology Development and Information Extraction platforms.


