An inductive algorithm approach to knowledge acquisition for expert system development. A pilot study.

Author information

Henry S B

Affiliation

University of California, San Francisco 94143-0608, USA.

Publication information

Comput Nurs. 1995 Sep-Oct;13(5):226-32.

PMID:7585305

Abstract

Knowledge acquisition, which consists of knowledge elicitation and knowledge representation, often is considered the weakest link in the design of expert systems. Systems frequently are built on the knowledge of one expert and require extensive use of knowledge engineering techniques to elicit this knowledge from the expert. Inductive algorithms are a potential alternative method of knowledge acquisition for expert system development. The aim of this pilot study was to examine the feasibility of applying machine learning techniques, specifically, inductive algorithms, to an existing research database as a method for knowledge elicitation and knowledge representation for expert system development. Two inductive algorithms (C4 and Classification and Regression Trees [CART]) that generate decision trees were selected for the analysis using a data set of 201 patients hospitalized for Pneumocystis carinii pneumonia. Neither C4 nor CART produced trees with an accuracy that was significantly better than the baseline accuracy (71.3%) for prediction of outcome in the data set. The mean accuracy of the C4 decision trees was below baseline and the mean accuracy of CART decision trees was 74.6%. The experts found both algorithms comprehensible, but not adequate, and identified important missing predictor variables. The study findings suggest that additional research is needed to examine the appropriate use of inductive algorithms in the transformation of nursing data and information into nursing knowledge.
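The comparison against baseline accuracy can be illustrated with a minimal sketch. It assumes that "baseline accuracy" means always predicting the most frequent outcome, which the abstract does not state. The records, predictor values, and function names below are hypothetical and are not the study data. A one-split decision stump stands in for tree induction; it is not C4 or CART, and accuracy is measured on the same records used to build the rule, for simplicity.

```python
from collections import Counter

# Hypothetical toy records (NOT the study data): (predictor value, outcome).
records = [
    ("low", "survived"), ("low", "survived"), ("low", "survived"),
    ("low", "survived"), ("low", "died"),
    ("high", "died"), ("high", "died"), ("high", "survived"),
    ("high", "survived"), ("high", "survived"),
]


def majority_label(labels):
    """Return the most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def baseline_accuracy(rows):
    """Accuracy of always predicting the most frequent outcome (assumed baseline)."""
    labels = [outcome for _, outcome in rows]
    top = majority_label(labels)
    return sum(outcome == top for outcome in labels) / len(labels)


def stump_accuracy(rows):
    """One-split tree: predict the majority outcome within each predictor value.

    Returns (accuracy on the same rows, learned rule).
    """
    groups = {}
    for value, outcome in rows:
        groups.setdefault(value, []).append(outcome)
    rule = {value: majority_label(outcomes) for value, outcomes in groups.items()}
    correct = sum(rule[value] == outcome for value, outcome in rows)
    return correct / len(rows), rule


if __name__ == "__main__":
    base = baseline_accuracy(records)
    acc, rule = stump_accuracy(records)
    print(f"baseline accuracy: {base:.1%}")
    print(f"stump accuracy:    {acc:.1%}")
    print(f"learned rule:      {rule}")
```

On this toy data the stump predicts the same outcome on both branches, so it does no better than the baseline. That is the kind of result a tree produces when the available predictors carry little information about the outcome.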
