An experiment comparing lexical and statistical methods for extracting MeSH terms from clinical free text.

Authors

Cooper G F, Miller R A

Affiliation

Center for Biomedical Informatics, University of Pittsburgh, PA 15213-2582, USA.

Publication

J Am Med Inform Assoc. 1998 Jan-Feb;5(1):62-75. doi: 10.1136/jamia.1998.0050062.

Abstract

OBJECTIVE

A primary goal of the University of Pittsburgh's 1990-94 UMLS-sponsored effort was to develop and evaluate PostDoc (a lexical indexing system) and Pindex (a statistical indexing system) comparatively, and then in combination as a hybrid system. Each system takes as input a portion of the free text from a narrative part of a patient's electronic medical record and returns a list of suggested MeSH terms to use in formulating a Medline search that includes concepts in the text. This paper describes the systems and reports an evaluation. The intent is for this evaluation to serve as a step toward the eventual realization of systems that assist healthcare personnel in using the electronic medical record to construct patient-specific searches of Medline.

DESIGN

The authors tested the performances of PostDoc, Pindex, and a hybrid system, using text taken from randomly selected clinical records, which were stratified to include six radiology reports, six pathology reports, and six discharge summaries. They identified concepts in the clinical records that might conceivably be used in performing a patient-specific Medline search. Each system was given the free text of each record as an input. The extent to which a system-derived list of MeSH terms captured the relevant concepts in these documents was determined based on blinded assessments by the authors.

RESULTS

PostDoc output a mean of approximately 19 MeSH terms per report, which included about 40% of the relevant report concepts. Pindex output a mean of approximately 57 terms per report and captured about 45% of the relevant report concepts. A hybrid system captured approximately 66% of the relevant concepts and output about 71 terms per report.
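The capture percentages above can be read as a recall-style measure: the share of relevant report concepts that appear in a system's suggested term list. The sketch below illustrates that arithmetic on made-up data. It makes two assumptions that are not in the abstract: matching is done by exact string comparison (the study relied on blinded human assessment), and the hybrid list is formed as a simple de-duplicated union of the two outputs. The names `capture_rate` and `hybrid_terms` and all term lists are hypothetical.

```python
def capture_rate(relevant_concepts, suggested_terms):
    """Fraction of relevant concepts found in the suggested term list.

    Assumption: a concept counts as captured only on an exact string match.
    """
    relevant = set(relevant_concepts)
    if not relevant:
        return 0.0
    return len(relevant & set(suggested_terms)) / len(relevant)


def hybrid_terms(lexical_terms, statistical_terms):
    """Order-preserving union of two term lists.

    Assumption: the abstract does not say how the hybrid system combines
    its inputs; a plain union is used here only for illustration.
    """
    seen = set()
    merged = []
    for term in list(lexical_terms) + list(statistical_terms):
        if term not in seen:
            seen.add(term)
            merged.append(term)
    return merged


# Toy data, invented for this example (not taken from the study).
relevant = ["Pneumonia", "Pleural Effusion", "Heart Failure",
            "Diabetes Mellitus", "Atelectasis"]
lexical = ["Pneumonia", "Heart Failure", "Lung"]
statistical = ["Pleural Effusion", "Pneumonia", "Radiography", "Thorax"]
hybrid = hybrid_terms(lexical, statistical)

for name, terms in [("lexical", lexical),
                    ("statistical", statistical),
                    ("hybrid", hybrid)]:
    print(f"{name}: {len(terms)} terms, "
          f"capture rate {capture_rate(relevant, terms):.0%}")
```

In this toy case the union captures more concepts than either list alone but is also longer than either, which mirrors the trade-off reported for the hybrid system.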

CONCLUSION

The outputs of PostDoc and Pindex are complementary in capturing MeSH terms from clinical free text. The results suggest possible approaches to reduce the number of terms output while maintaining the percentage of terms captured, including the use of UMLS semantic types to constrain the output list to contain only clinically relevant MeSH terms.
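One way to picture the semantic-type constraint mentioned above is as a filter that keeps a suggested term only if at least one of its semantic types is on an allowed list. The following is a minimal sketch under stated assumptions: the term-to-type mapping is a small hand-written dictionary for illustration rather than data loaded from a UMLS release, the choice of "clinically relevant" types is arbitrary, and `filter_by_semantic_type` is a hypothetical helper, not part of PostDoc or Pindex.

```python
# Assumption: an arbitrary, illustrative choice of "clinically relevant" types.
CLINICAL_SEMANTIC_TYPES = {
    "Disease or Syndrome",
    "Sign or Symptom",
    "Pharmacologic Substance",
}

# Assumption: hand-written toy mapping, not loaded from UMLS files.
TERM_SEMANTIC_TYPES = {
    "Pneumonia": {"Disease or Syndrome"},
    "Dyspnea": {"Sign or Symptom"},
    "Aspirin": {"Pharmacologic Substance", "Organic Chemical"},
    "Hospitals": {"Health Care Related Organization"},
    "Time Factors": {"Temporal Concept"},
}


def filter_by_semantic_type(terms, term_types, allowed_types):
    """Keep terms that have at least one allowed semantic type.

    Terms missing from the mapping are dropped, which is a conservative
    choice made for this sketch.
    """
    return [t for t in terms if term_types.get(t, set()) & allowed_types]


suggested = ["Pneumonia", "Hospitals", "Aspirin", "Time Factors", "Unknown Term"]
kept = filter_by_semantic_type(suggested, TERM_SEMANTIC_TYPES,
                               CLINICAL_SEMANTIC_TYPES)
print(kept)
```

A filter of this kind shortens the output list. Whether it preserves the capture rate depends on how well the allowed types cover the concepts that assessors consider relevant, which the abstract leaves as a possible approach rather than a tested result.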

