Arts D G T, Cornet R, de Jonge E, de Keizer N F
Academic Medical Center, Department of Medical Informatics. P.O. Box 22700, 1100 DE Amsterdam, The Netherlands.
Methods Inf Med. 2005;44(5):616-25.
The usability of terminological systems (TSs) strongly depends on the coverage and correctness of their content. The objective of this study was to provide a literature overview of aspects related to the content of TSs and of methods for evaluating that content, and to investigate the extent to which these methods overlap or complement each other.
We reviewed the literature and composed definitions for aspects of the evaluation of TS content. Of the methods described in the literature, three were selected: 1) concept matching, in which two samples of concepts, representing a) documentation of reasons for admission in daily care practice and b) aggregation of patient groups for research, are looked up in the TS to assess its coverage; 2) formal algorithmic evaluation, in which reasoning over the formally represented content is used to detect inconsistencies; and 3) expert review, in which a random sample of concepts is checked for incorrect and incomplete terms and relations. These evaluation methods were applied in a case study of the locally developed TS DICE (Diagnoses for Intensive Care Evaluation).
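The coverage measurement behind concept matching can be illustrated with a small sketch. A toy term-to-concept index stands in for the terminological system; DICE itself is not publicly modeled here, so all concept names, IDs, and sample terms below are hypothetical.

```python
# Minimal sketch of concept matching: look up each sampled term in the TS
# and report the fraction of exact matches (the "perfect match" rate).
# All data here is illustrative, not taken from DICE.

def match_rate(sample, ts_index):
    """Fraction of sampled terms that exactly match a TS concept."""
    matched = sum(1 for term in sample if term.lower() in ts_index)
    return matched / len(sample)

# Hypothetical TS content: normalized terms mapped to concept IDs.
ts_index = {
    "septic shock": "C001",
    "acute renal failure": "C002",
    "pneumonia": "C003",
}

# Two illustrative samples, reflecting the two use cases in the study:
# admission documentation vs. aggregation of patient groups for research.
admissions = ["septic shock", "pneumonia", "head trauma", "pneumonia"]
research = ["acute renal failure", "copd exacerbation"]

print(round(match_rate(admissions, ts_index), 2))  # 0.75
print(round(match_rate(research, ts_index), 2))    # 0.5
```

In practice the study graded matches beyond a binary hit (e.g. partial matches); this sketch only computes the exact-match rate for two different samples, showing why sample composition changes the measured coverage.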
None of the applied methods covered all aspects of the content of a TS. The results of concept matching differed between the two use cases (63% vs. 52% perfect matches). Expert review revealed many more errors and omissions than formal algorithmic evaluation did.
To evaluate the content of a TS, a combination of evaluation methods is preferable. Different representative samples, reflecting the different uses of TSs, lead to different concept-matching results. Expert review appears to be very valuable but time consuming. Formal algorithmic evaluation has the potential to decrease the workload of human reviewers but detects only logical inconsistencies. Further research is required to exploit the potential of formal algorithmic evaluation.
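The kind of logical inconsistency that formal algorithmic evaluation can detect may be sketched as follows. Real systems use a description-logic classifier over the formally represented content; this toy version, with invented concept names and a hand-written disjointness axiom, merely flags concepts whose definitions place them under two classes declared disjoint.

```python
# Minimal sketch of formal algorithmic evaluation: detect concepts whose
# stated parents violate a disjointness axiom. Concept names and axioms
# are illustrative stand-ins for a DL-based representation.

# Axiom: no concept may be both an Infection and a Trauma.
disjoint_pairs = {frozenset({"Infection", "Trauma"})}

# Hypothetical concept definitions: concept -> set of asserted parents.
definitions = {
    "BacterialPneumonia": {"Infection"},
    "CrushInjury": {"Trauma"},
    "MiscodedDiagnosis": {"Infection", "Trauma"},  # logically inconsistent
}

def inconsistent_concepts(definitions, disjoint_pairs):
    """Return concepts asserted under two mutually disjoint classes."""
    flagged = []
    for concept, parents in definitions.items():
        for pair in disjoint_pairs:
            if pair <= parents:  # both disjoint classes are parents
                flagged.append(concept)
    return flagged

print(inconsistent_concepts(definitions, disjoint_pairs))
# ['MiscodedDiagnosis']
```

Note how narrow the check is: a missing term or a clinically wrong but logically consistent relation, the kind of defect expert review caught in the study, would pass this test untouched.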