Don't care values in induction.

Authors

Diamantidis N, Giakoumakis E A

Affiliation

Informatics Department, Athens University of Economics and Business, Greece.

Publication information

Artif Intell Med. 1996 Oct;8(5):505-14. doi: 10.1016/S0933-3657(96)00357-0.

Abstract

Inductive learning algorithms are powerful tools for extracting knowledge from data, and their success in medical domains is well known. In medical diagnosis, and in real-world applications generally, inductive learning algorithms have to deal with unknown values, among other problems. In most cases unknown values are treated as missing ones, i.e. values that are related to the class of the training examples but are missing due to lack of measurements. In this paper we address the problem of don't care values, which are unknown because they are irrelevant to the class of the examples. The distinction between don't care values and missing ones is important in medical domains: with it, experts are able to relate each diagnosis to the appropriate subset of attributes. We present techniques for dealing efficiently with don't care values in the induction of decision trees. Furthermore, we examine the importance of the distinction between missing and don't care values, and we investigate the existence of don't care values, as opposed to missing ones, in medical and non-medical real-world datasets.
