Biomedical negation scope detection with conditional random fields.

Affiliation

Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA.

Publication information

J Am Med Inform Assoc. 2010 Nov-Dec;17(6):696-701. doi: 10.1136/jamia.2010.003228.

Abstract

OBJECTIVE

Negation is a linguistic phenomenon that marks the absence of an entity or event. Negated events are frequently reported in both biological literature and clinical notes. Text mining applications benefit from the detection of negation and its scope. However, due to the complexity of language, identifying the scope of negation in a sentence is not a trivial task.

DESIGN

Conditional random fields (CRF), a supervised machine-learning algorithm, were used to train models to detect negation cue phrases and their scope in both biological literature and clinical notes. The models were trained on the publicly available BioScope corpus.
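
As an illustration of how cue and scope detection can be framed as sequence labeling for a CRF, the sketch below turns one annotated sentence into per-token labels and feature dictionaries. The label names (`CUE`, `SCOPE`, `O`), the feature set, and the helper names `label_tokens` and `token_features` are assumptions made for this example; the abstract does not specify the labeling scheme or features the authors used.

```python
from typing import Dict, List, Tuple


def label_tokens(
    tokens: List[str], cue: Tuple[int, int], scope: Tuple[int, int]
) -> List[str]:
    """Assign one label per token.

    `cue` and `scope` are half-open token index ranges. The label names
    are hypothetical; a cue token inside the scope is labeled CUE.
    """
    labels = []
    for i in range(len(tokens)):
        if cue[0] <= i < cue[1]:
            labels.append("CUE")
        elif scope[0] <= i < scope[1]:
            labels.append("SCOPE")
        else:
            labels.append("O")
    return labels


def token_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Hypothetical feature set: the lowercased word and its neighbours."""
    return {
        "word": tokens[i].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


# Invented example sentence, not taken from the BioScope corpus.
tokens = ["The", "scan", "showed", "no", "evidence", "of", "pneumonia", "."]
labels = label_tokens(tokens, cue=(3, 4), scope=(3, 7))
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Feature dictionaries and label sequences of this shape are the usual input to CRF toolkits, which then learn how likely each label is given the features and the neighbouring labels.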

MEASUREMENT

The performance of the CRF models was evaluated on identifying the negation cue phrases and their scope by calculating recall, precision and F1-score. The models were compared with four competitive baseline systems.
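
For reference, the three metrics can be computed from label sequences as in the sketch below. Scoring per token is an assumption made for this example; the abstract does not state the granularity at which the authors scored cues and scopes, and the function name `precision_recall_f1` is hypothetical.

```python
from typing import List, Tuple


def precision_recall_f1(
    gold: List[str], predicted: List[str], positive: str
) -> Tuple[float, float, float]:
    """Token-level precision, recall and F1 for one label of interest."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented labels: two correct scope tokens, one missed, one spurious.
gold = ["O", "SCOPE", "SCOPE", "SCOPE", "O"]
predicted = ["O", "SCOPE", "SCOPE", "O", "SCOPE"]
precision, recall, f1 = precision_recall_f1(gold, predicted, "SCOPE")
```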

RESULTS

The best CRF-based model performed statistically better than all baseline systems and NegEx. In clinical notes it achieved F1-scores of 98% and 95% for detecting negation cue phrases and their scope, respectively; in biological literature the corresponding F1-scores were 97% and 85%.

CONCLUSIONS

This approach is robust, as it can identify negation scope in both biological and clinical text. To benefit text mining applications, the system is publicly available as a Java API and as an online application at http://negscope.askhermes.org.

Similar articles

Automatic discourse connective detection in biomedical text.
J Am Med Inform Assoc. 2012 Sep-Oct;19(5):800-8. doi: 10.1136/amiajnl-2011-000775. Epub 2012 Jun 28.

Negated bio-events: analysis and identification.
BMC Bioinformatics. 2013 Jan 16;14:14. doi: 10.1186/1471-2105-14-14.

Cited by

Electronic Health Record (EHR) Abstraction.
Perspect Health Inf Manag. 2021 Mar 15;18(Spring):1g. eCollection 2021 Spring.

EMR2vec: Bridging the gap between patient data and clinical trial.
Comput Ind Eng. 2021 Jun;156:107236. doi: 10.1016/j.cie.2021.107236. Epub 2021 Mar 15.
