Li Han-Lin, Chen Ming-Hsien
Institute of Information Management, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu 300, Taiwan, Republic of China.
Comput Biol Med. 2008 Jan;38(1):42-52. doi: 10.1016/j.compbiomed.2007.07.006. Epub 2007 Sep 14.
Inducing critical classification rules from observed data is a major task in biological and medical research. A classification rule is considered useful if it is optimal and simultaneously satisfies three criteria: high accuracy, a high rate of support, and high compactness. However, current classification methods, such as rough set theory, neural networks, and ID3, may induce only feasible rules rather than optimal ones. In addition, the rules found by current methods may satisfy only one of the three criteria. This study proposes a multi-criteria model to induce optimal classification rules with better rates of accuracy, support, and compactness. The rule-induction problem is formulated as a linear multi-objective programming model. Two practical data sets, one of HSV patient results and another of European barn swallows, are tested. The results illustrate that the proposed method can induce better rules than current methods.
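The abstract names the three criteria without defining them. As a rough illustration only, the sketch below scores one candidate rule under commonly used definitions, which are assumptions here and may differ from those in the paper: accuracy is the share of covered records that belong to the target class, support is the share of target-class records that the rule covers, and compactness is one minus the fraction of attributes the rule uses. The names `Interval`, `covers`, `rule_scores`, and the toy data are hypothetical and do not come from the paper or its data sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One condition of a rule: low <= record[attr] <= high (hypothetical form)."""
    attr: str
    low: float
    high: float


def covers(rule, record):
    """A rule covers a record when every one of its conditions holds."""
    return all(c.low <= record[c.attr] <= c.high for c in rule)


def rule_scores(rule, records, labels, target, n_attributes):
    """Return (accuracy, support, compactness) under the assumed definitions."""
    covered = [lab for rec, lab in zip(records, labels) if covers(rule, rec)]
    hits = sum(1 for lab in covered if lab == target)
    in_class = sum(1 for lab in labels if lab == target)
    accuracy = hits / len(covered) if covered else 0.0
    support = hits / in_class if in_class else 0.0
    # Assumed definition: fewer distinct attributes means a more compact rule.
    compactness = 1 - len({c.attr for c in rule}) / n_attributes
    return accuracy, support, compactness


# Made-up toy data with two attributes, for illustration only.
toy_records = [
    {"a1": 1.0, "a2": 5.0},
    {"a1": 2.0, "a2": 1.0},
    {"a1": 3.0, "a2": 4.0},
    {"a1": 8.0, "a2": 2.0},
    {"a1": 2.5, "a2": 9.0},
]
toy_labels = ["pos", "pos", "pos", "neg", "neg"]
toy_rule = [Interval("a1", 0.0, 3.0)]  # "if 0 <= a1 <= 3 then pos"

print(rule_scores(toy_rule, toy_records, toy_labels, "pos", n_attributes=2))
```

On this toy data the rule covers four records, three of them in the target class, so it captures the whole class while admitting one false positive. A multi-objective formulation such as the one the abstract describes would search for rules that trade these three scores off against one another, rather than scoring a single hand-written rule as done here.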