Medical text representations for inductive learning.

Authors

Wilcox A, Hripcsak G

Affiliation

Department of Medical Informatics, Columbia University, New York, NY, USA.

Publication

Proc AMIA Symp. 2000:923-7.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2243822/

Abstract

Inductive learning algorithms have been proposed as methods for classifying medical text reports. Many of these proposed techniques differ in the way the text is represented for use by the learning algorithms. Slight differences can occur between representations that may be chosen arbitrarily, but such differences can significantly affect classification algorithm performance. We examined 8 different data representation techniques used for medical text, and evaluated their use with standard machine learning algorithms. We measured the loss of classification-relevant information due to each representation. Representations that captured status information explicitly resulted in significantly better performance. Algorithm performance was dependent on subtle differences in data representation.
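To make the point about status information concrete, here is a minimal sketch of two ways one might encode the same report findings. It is an illustration only, not one of the 8 representations evaluated in the paper: the `(finding, status)` input format, the function names `presence_only` and `status_aware`, the status labels, and the +1/-1/0 coding are all assumptions chosen for the example.

```python
# Hypothetical illustration; these are NOT the representations from the paper.
from typing import List, Tuple

# Assumed input: each report is a list of (finding, status) pairs,
# such as a natural language processor might emit.
Report = List[Tuple[str, str]]


def presence_only(report: Report, vocabulary: List[str]) -> List[int]:
    """Encode 1 if a finding is mentioned at all, ignoring its status."""
    mentioned = {finding for finding, _ in report}
    return [1 if term in mentioned else 0 for term in vocabulary]


def status_aware(report: Report, vocabulary: List[str]) -> List[int]:
    """Encode +1 if asserted present, -1 if negated, 0 if not mentioned."""
    codes = {"present": 1, "absent": -1}
    status = {finding: codes[label] for finding, label in report}
    return [status.get(term, 0) for term in vocabulary]


vocabulary = ["infiltrate", "effusion"]
positive_report = [("infiltrate", "present")]  # e.g. "infiltrate seen"
negative_report = [("infiltrate", "absent")]   # e.g. "no infiltrate"

# The presence-only encoding maps both reports to the same vector,
# so a learner cannot tell them apart.
print(presence_only(positive_report, vocabulary))
print(presence_only(negative_report, vocabulary))

# The status-aware encoding keeps the two reports distinct.
print(status_aware(positive_report, vocabulary))
print(status_aware(negative_report, vocabulary))
```

Under these assumptions, the first encoding discards exactly the information that separates a positive report from a negated one, which is the kind of classification-relevant loss the abstract describes.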
