Department of Philosophy, University of Rome "La Sapienza", Rome, Italy.
Artif Intell Med. 2011 Jul;52(3):123-39. doi: 10.1016/j.artmed.2011.04.002. Epub 2011 May 28.
The aim of this paper is to study the feasibility and performance of several classifier systems belonging to the family of instance-based (IB) learning, both as second-opinion diagnostic tools and as tools for the knowledge extraction phase of knowledge discovery in clinical databases.
We consider three clinical databases: the first relates to the differential diagnosis of erythemato-squamous diseases, the second to the diagnosis of the onset of diabetes mellitus, and the third to a diagnostic imaging problem in nuclear cardiology. We apply five IB classifiers to each database: two are based on exemplars, one is based on prototypes, and two are hybrid. One of the hybrid classifiers is new, introduced here, and is called the prototype exemplar learning classifier (PEL-C). We use cross-validation techniques to evaluate and compare the performance of the classifiers as diagnostic tools, considering indexes such as accuracy, sensitivity, specificity, and conciseness of class representations. Moreover, we analyze the number and type of instances that represent the diagnostic classes learnt by each classifier, in order to evaluate and compare their knowledge extraction capabilities.
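The evaluation procedure described above can be illustrated with a minimal sketch. It assumes a plain k-nearest-neighbour classifier with Euclidean distance, binary labels (1 = disease present), and a simple interleaved fold split; the toy data and the names `knn_predict`, `cross_validate` and `toy_data` are hypothetical and are not taken from the paper, whose exact protocol is not given in the abstract.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def cross_validate(data, n_folds=5, k=3):
    """Return (accuracy, sensitivity, specificity) pooled over all folds."""
    tp = tn = fp = fn = 0
    for fold in range(n_folds):
        # Interleaved split: every n_folds-th item goes to the test fold.
        test = data[fold::n_folds]
        train = [item for i, item in enumerate(data) if i % n_folds != fold]
        for features, label in test:
            predicted = knn_predict(train, features, k)
            if label == 1 and predicted == 1:
                tp += 1
            elif label == 0 and predicted == 0:
                tn += 1
            elif label == 0 and predicted == 1:
                fp += 1
            else:
                fn += 1
    accuracy = (tp + tn) / len(data)
    sensitivity = tp / (tp + fn)  # fraction of diseased cases detected
    specificity = tn / (tn + fp)  # fraction of healthy cases cleared
    return accuracy, sensitivity, specificity

# Hypothetical toy data: two well-separated clusters, classes interleaved.
toy_data = []
for i in range(10):
    toy_data.append(((0.1 * i, 0.1 * i), 0))
    toy_data.append(((5 + 0.1 * i, 5 + 0.1 * i), 1))

accuracy, sensitivity, specificity = cross_validate(toy_data)
print(accuracy, sensitivity, specificity)
```

Pooling the confusion counts over folds, rather than averaging per-fold rates, is one possible design choice; it avoids undefined rates when a fold happens to contain no positive or no negative cases.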
The experimental results show that the classifiers with the best classification performance are the optimized k-nearest neighbour classifier (k-NNC) and PEL-C. The k-NNC uses the highest number of representative instances, 100% of the entire database, whereas PEL-C uses far fewer: on average, about 3% of the database. Regarding knowledge extraction, we interpret the class representations obtained by IB classifiers as a form of nosological knowledge. We find the most interesting diagnostic class representations to be those extracted by PEL-C, because they are composed of a mixture of abstracted prototypical cases (syndromes) and selected atypical clinical cases.
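The contrast between storing every exemplar and storing a compact class representation can be sketched as follows. This is not PEL-C, whose algorithm is not specified in the abstract; it is a plain nearest-prototype classifier that keeps one mean vector per class, shown only to illustrate what "conciseness of class representations" measures. The data and all names (`class_prototypes`, `nearest_prototype`, `cases`) are hypothetical.

```python
import math

def class_prototypes(data):
    """Average the feature vectors of each class into one prototype per class."""
    sums, counts = {}, {}
    for features, label in data:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def nearest_prototype(prototypes, query):
    """Return the label whose prototype is closest to `query`."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], query))

# Hypothetical toy cases: two diagnostic classes, four cases each.
cases = [
    ((0, 0), "A"), ((0, 2), "A"), ((2, 0), "A"), ((2, 2), "A"),
    ((8, 8), "B"), ((8, 10), "B"), ((10, 8), "B"), ((10, 10), "B"),
]

prototypes = class_prototypes(cases)

# Fraction of the database that must be stored to classify new cases.
# An exemplar-based k-NN would store every case, i.e. a fraction of 1.0.
stored_fraction = len(prototypes) / len(cases)

print(prototypes, stored_fraction)
print(nearest_prototype(prototypes, (1.5, 0.5)))
```

A hybrid method in the spirit described above would additionally retain some atypical cases that the class means represent poorly; how those cases are selected is left open here.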
This study shows that IB methods, most notably the optimized k-NNC and PEL-C, can be used in clinical decision support systems and may be advantageous there, and that IB classifiers can be used for nosological knowledge extraction. Because PEL-C uses more compact and potentially more meaningful class descriptions, it is preferable when the diagnostic problem at hand requires less storage space or when knowledge extraction is itself the goal. The complexity and responsibility of diagnostic practice require that these results be further confirmed in other clinical domains.