School of Computing and Information Engineering, University of Ulster, Coleraine BT52 1SA, Northern Ireland, United Kingdom.
Artif Intell Med. 2011 Jun;52(2):59-66. doi: 10.1016/j.artmed.2011.04.007. Epub 2011 May 20.
Balancing the trade-offs between solution quality, problem-solving efficiency, and transparency is an important challenge in medical applications of conversational case-based reasoning (CCBR). For example, test selection in CCBR is often based on strategies in which no specific hypothesis (e.g., a diagnosis) is being confirmed, which makes it difficult to explain the relevance of the test results that users are asked to provide. In this paper, we present an approach to CCBR in medical classification and diagnosis that aims to increase transparency while also providing high levels of accuracy and efficiency.
We present an algorithm for CCBR called iNN(k) in which feature selection is driven by the goal of confirming a target class and informed by a measure of a feature's discriminating power in favor of the target class. As we demonstrate in a CCBR system called CBR-Confirm, this enables a CCBR system to explain the relevance of any question it asks the user. We evaluate the algorithm's accuracy and efficiency on a selection of datasets related to medicine and health care.
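The idea of goal-driven question selection can be illustrated with a minimal sketch. It is not the published iNN(k) algorithm: the similarity function, the scoring formula in `discriminating_power` (a simple difference of value frequencies inside and outside the target class), and all function names and toy data are assumptions made for illustration only. The sketch picks a target class by majority vote among the k cases most similar to the partial query, then asks about the feature value that most favors that class.

```python
from collections import Counter

# A case is a pair: (dict of feature -> value, class label).

def similarity(case_features, query):
    """Number of answered query features whose value matches the case."""
    return sum(1 for f, v in query.items() if case_features.get(f) == v)

def target_class(cases, query, k):
    """Majority class among the k cases most similar to the partial query."""
    ranked = sorted(cases, key=lambda c: similarity(c[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def discriminating_power(cases, feature, value, target):
    """Assumed score (not the paper's measure):
    P(value | target class) - P(value | other classes)."""
    inside = [f for f, label in cases if label == target]
    outside = [f for f, label in cases if label != target]
    p_in = sum(f.get(feature) == value for f in inside) / len(inside) if inside else 0.0
    p_out = sum(f.get(feature) == value for f in outside) / len(outside) if outside else 0.0
    return p_in - p_out

def next_question(cases, query, k):
    """Return (target, feature, supporting_value) for the next question,
    or (target, None, None) when no unanswered feature remains."""
    target = target_class(cases, query, k)
    best = None
    for features, label in cases:
        if label != target:
            continue
        for feat, val in features.items():
            if feat in query:
                continue  # already answered by the user
            score = discriminating_power(cases, feat, val, target)
            if best is None or score > best[0]:
                best = (score, feat, val)
    if best is None:
        return target, None, None
    return target, best[1], best[2]

# Invented toy case base, purely for demonstration.
toy_cases = [
    ({"f1": "a", "f2": "x"}, "C1"),
    ({"f1": "a", "f2": "x"}, "C1"),
    ({"f1": "b", "f2": "y"}, "C2"),
    ({"f1": "a", "f2": "y"}, "C2"),
]
target, feature, value = next_question(toy_cases, {"f1": "a"}, k=3)
print(f"Asking about {feature}: the value {value!r} would support class {target}")
```

Because each question is tied to a target class and a supporting value, the system can state why it is asking, which is the kind of explanation the abstract attributes to CBR-Confirm.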
The performance of iNN(k) on a given dataset is shown to depend on the value of k and on whether local or global feature selection is used in the algorithm. The combination of these parameters for which iNN(k) is most effective in addressing the trade-off between accuracy and efficiency is identified for each of the selected datasets. For example, iNN(k) needed on average only 42% and 51% of the features in a complete problem description to provide accuracy levels of 86.5% and 84.3% on the lymphography and SPECT heart datasets, respectively, from the UCI machine learning repository.
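The efficiency figures above are average percentages of features needed per problem. A minimal sketch of that arithmetic follows, assuming efficiency is measured as the number of questions asked in each dialogue divided by the number of features in a complete problem description. The function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def mean_fraction_of_features(questions_asked, total_features):
    """Average fraction of a complete problem description that was needed.

    questions_asked: number of features elicited in each test dialogue.
    total_features: number of features in a complete problem description.
    """
    if not questions_asked or total_features <= 0:
        raise ValueError("need at least one dialogue and a positive feature count")
    return sum(questions_asked) / (total_features * len(questions_asked))

# Hypothetical counts for three dialogues over a 10-feature description.
fraction = mean_fraction_of_features([2, 4, 6], total_features=10)
print(f"{fraction:.0%} of features needed on average")
```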
Our results demonstrate the ability of iNN(k) to provide high levels of accuracy on most of the selected datasets, while often requiring the user to provide only a small subset of the features in a complete problem description, and enabling a CCBR system to explain the relevance of any question it asks the user.