Machine learning and rule-based approaches to assertion classification.

Authors

Uzuner Ozlem, Zhang Xiaoran, Sibanda Tawanda

Affiliation

Information Studies, State University of New York, Albany, NY, USA.

Publication information

J Am Med Inform Assoc. 2009 Jan-Feb;16(1):109-15. doi: 10.1197/jamia.M2950. Epub 2008 Oct 24.

DOI:10.1197/jamia.M2950

PMID:18952931

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2605605/

Abstract

OBJECTIVES

The authors study two approaches to assertion classification. One of these approaches, Extended NegEx (ENegEx), extends the rule-based NegEx algorithm to cover alter-association assertions; the other, Statistical Assertion Classifier (StAC), presents a machine learning solution to assertion classification.

DESIGN

For each mention of each medical problem, both approaches determine whether the problem, as asserted by the context of that mention, is present, absent, or uncertain in the patient, or associated with someone other than the patient. The authors use these two systems to (1) extend negation and uncertainty extraction to recognition of alter-association assertions, (2) determine the contribution of lexical and syntactic context to assertion classification, and (3) test if a machine learning approach to assertion classification can be as generally applicable and useful as its rule-based counterparts.
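To make the four-way decision concrete, here is a minimal rule-based sketch in the spirit of trigger-phrase matching. It is not the authors' ENegEx implementation: the trigger lists, the priority order, the `classify_assertion` name, and the choice to look only at the tokens preceding the mention are all assumptions made for illustration.

```python
import re

# Hypothetical trigger phrases, chosen for illustration only; they are not
# the ENegEx rule set described in the paper.
TRIGGERS = {
    "absent": ["no", "denies", "without", "negative for"],
    "uncertain": ["possible", "probable", "rule out", "suspected"],
    "alter-association": ["mother", "father", "brother", "sister",
                          "family history of"],
}


def classify_assertion(sentence, problem, window=4):
    """Return an assertion class for `problem`, based on trigger phrases
    found within `window` tokens before the first mention of the problem."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    target = re.findall(r"[a-z]+", problem.lower())
    # Locate the first occurrence of the problem mention in the sentence.
    start = next(
        (i for i in range(len(tokens) - len(target) + 1)
         if tokens[i:i + len(target)] == target),
        None,
    )
    if start is None:
        raise ValueError("problem mention not found in sentence")
    context = " ".join(tokens[max(0, start - window):start])
    # Check classes in a fixed (assumed) priority order; default to "present".
    for label in ("alter-association", "absent", "uncertain"):
        for trigger in TRIGGERS[label]:
            if re.search(r"\b" + re.escape(trigger) + r"\b", context):
                return label
    return "present"
```

For example, `classify_assertion("Her mother had breast cancer.", "breast cancer")` yields `"alter-association"` under these assumed triggers, while a sentence with no trigger before the mention falls through to `"present"`.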

MEASUREMENTS

The authors evaluated assertion classification approaches with precision, recall, and F-measure.
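As a reminder of how these three measures relate, the sketch below computes them for a single assertion class from parallel lists of gold and predicted labels. The abstract does not state how the F-measure is weighted, so the balanced form (the harmonic mean of precision and recall) is assumed here; the function name and the toy labels are hypothetical.

```python
def precision_recall_f(gold, predicted, label):
    """Per-class precision, recall, and balanced F-measure (assumed F1)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    # Guard against empty denominators when a class is never predicted
    # or never appears in the gold standard.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Toy labels, invented for illustration; not data from the study.
gold = ["present", "absent", "absent", "uncertain"]
predicted = ["present", "absent", "present", "uncertain"]
p, r, f = precision_recall_f(gold, predicted, "absent")
```

With these toy labels, one of the two gold "absent" mentions is recovered and nothing is wrongly labeled "absent", so precision is 1.0 and recall is 0.5.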

RESULTS

The ENegEx algorithm is a general algorithm that can be directly applied to new corpora. Despite being based on machine learning, StAC can also be applied out-of-the-box to new corpora and achieve similar generality.

CONCLUSION

The StAC models developed on discharge summaries can be successfully applied to radiology reports. These models benefit most from words found within a ±4-word window of the target and can outperform ENegEx.
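The window-based lexical context mentioned above can be sketched as a simple feature extractor. This is only an illustration of taking the words within four tokens on each side of the target mention; the `L:`/`R:` side prefixes, the lowercasing, and the `window_features` name are assumptions, and the abstract does not describe StAC's actual feature encoding.

```python
def window_features(tokens, start, end, size=4):
    """Binary bag-of-words features for the words within `size` tokens on
    either side of the target span tokens[start:end]."""
    left = tokens[max(0, start - size):start]
    right = tokens[end:end + size]
    features = {}
    # Prefix each word with its side so that left and right context
    # remain distinguishable (an assumed encoding).
    for word in left:
        features["L:" + word.lower()] = 1
    for word in right:
        features["R:" + word.lower()] = 1
    return features


# Invented example sentence; the target mention is "chest pain".
tokens = "The patient denies any chest pain or shortness of breath today".split()
features = window_features(tokens, start=4, end=6)
```

A feature dictionary of this shape could then be passed to any standard classifier; which learner StAC uses is not stated in the abstract.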
