Institute of Information Management, National Chiao Tung University, Management Building 2, 1001 Ta-Hsueh Road, Hsinchu 300, Taiwan.
Comput Biol Med. 2011 Aug;41(8):587-99. doi: 10.1016/j.compbiomed.2011.05.002. Epub 2011 Jul 13.
Identifying classification rules for patients from a given dataset is an important task in medicine. For example, rules for estimating the likelihood of survival for patients undergoing breast cancer surgery are critical in treatment planning. Many well-known classification methods (such as decision tree methods and hyperplane methods) assume that classes can be separated by a linear function. However, these methods perform poorly when the boundaries between the classes are non-linear. This study presents a novel method, called DIAMOND, to induce classification rules from datasets containing non-linear interactions between the input data and the classes to be predicted. Given a set of objects, each belonging to a class, DIAMOND separates the objects into different cubes and assigns each cube to a class. By taking unions of these cubes, DIAMOND uses mixed-integer programs to induce classification rules with better rates of accuracy, support, and compactness. This study uses three practical datasets (Iris flower, HSV patients, and breast cancer patients) to illustrate the advantages of DIAMOND over some current methods.
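The cube idea in the abstract can be illustrated with a minimal Python sketch. It does not reproduce DIAMOND's mixed-integer programs, which the abstract does not specify; the cubes here are placed by hand. It only shows how a union of axis-aligned cubes can express a class boundary that no single linear function separates. The names `Cube`, `classify`, and `accuracy_and_support`, the toy data, and the exact definitions of the accuracy and support rates are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cube:
    """Axis-aligned box (hypothetical representation): one bound pair per attribute."""
    lower: tuple
    upper: tuple
    label: str

    def contains(self, point: Sequence[float]) -> bool:
        # A point is inside the cube when every attribute lies within its bounds.
        return all(lo <= x <= hi for lo, x, hi in zip(self.lower, point, self.upper))


def classify(cubes, point) -> Optional[str]:
    """Return the label of the first cube containing the point, or None if uncovered."""
    for cube in cubes:
        if cube.contains(point):
            return cube.label
    return None


def accuracy_and_support(cubes, points, labels, target):
    """Assumed definitions, not taken from the paper:
    accuracy = share of points covered by the target's cubes that belong to the target;
    support  = share of the target's points that are covered by the target's cubes."""
    rule = [c for c in cubes if c.label == target]
    covered = [y for p, y in zip(points, labels) if any(c.contains(p) for c in rule)]
    hits = sum(1 for y in covered if y == target)
    n_target = sum(1 for y in labels if y == target)
    accuracy = hits / len(covered) if covered else 0.0
    support = hits / n_target if n_target else 0.0
    return accuracy, support


# Toy XOR-like layout: class "A" occupies two opposite corners, so no single
# straight line separates "A" from "B", but a union of two cubes per class does.
cubes = [
    Cube((0.0, 0.0), (1.0, 1.0), "A"),
    Cube((2.0, 2.0), (3.0, 3.0), "A"),
    Cube((0.0, 2.0), (1.0, 3.0), "B"),
    Cube((2.0, 0.0), (3.0, 1.0), "B"),
]
points = [(0.5, 0.5), (2.5, 2.5), (0.5, 2.5), (2.5, 0.5)]
labels = ["A", "A", "B", "B"]

print([classify(cubes, p) for p in points])
print(accuracy_and_support(cubes, points, labels, "A"))
```

In this sketch the rule for class "A" reads as "the object lies in the first cube or in the second cube". Using fewer cubes per class gives a more compact rule, which is one plausible reading of the compactness criterion named in the abstract.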