Xu Jiucheng, Wang Yun, Xu Keqiang, Zhang Tianli
College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan, China.
Engineering Technology Research Center for Computing Intelligence and Data Mining, Xinxiang, Henan, China.
Comput Math Methods Med. 2019 Jan 27;2019:6705648. doi: 10.1155/2019/6705648. eCollection 2019.
Many existing feature gene selection algorithms concentrate on choosing and studying evaluation methods for feature genes, and neglect the accurate mapping of the original information during data processing. To address this problem, this paper proposes a new model, the rough uncertainty metric model. First, the fuzzy neighborhood granule of a sample is constructed by combining the fuzzy similarity relation with the neighborhood radius used in rough sets, and the rough decision is defined using the fuzzy similarity relation and the decision equivalence classes. Then, the fuzzy neighborhood granule and the rough decision are introduced into the conditional entropy to obtain the rough uncertainty metric model; a measure of the significance of feature genes is defined, and several related theorems are proved. To make the model tolerant of noise in the data, a variable precision model is introduced and the selection of its parameters is discussed. Finally, a feature gene selection algorithm is designed on the basis of the rough uncertainty metric model and compared with several existing similar algorithms. The experimental results show that the proposed algorithm selects smaller feature gene subsets with higher classification accuracy, which supports the effectiveness of the proposed model.
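The abstract gives no formulas, so the following is only a rough sketch of the general idea: a fuzzy similarity relation cut off by a neighborhood radius, a conditional-entropy-style uncertainty measure, and a greedy selection loop. The similarity function, the entropy formula, the stopping rule, and all names (`fuzzy_similarity`, `fuzzy_neighborhood_granule`, `conditional_entropy`, `greedy_select`, `delta`, `tol`) are assumptions made for illustration and are not the paper's definitions. Expression values are assumed to be scaled to [0, 1], and the variable precision extension is not modeled.

```python
import math


def fuzzy_similarity(x, y, genes):
    """Similarity in [0, 1]: 1 minus the largest per-gene gap.

    Assumed form; values are expected to be scaled to [0, 1].
    """
    return min(1.0 - abs(x[g] - y[g]) for g in genes)


def fuzzy_neighborhood_granule(data, i, genes, delta):
    """Membership of every sample in the granule of sample i.

    Similarities below 1 - delta are cut to 0, so delta plays the
    role of the neighborhood radius.
    """
    row = []
    for y in data:
        s = fuzzy_similarity(data[i], y, genes)
        row.append(s if s >= 1.0 - delta else 0.0)
    return row


def conditional_entropy(data, labels, genes, delta):
    """Entropy-style uncertainty of the decision given a gene subset.

    For each sample, the fuzzy size of its granule is compared with the
    part of the granule that shares the sample's decision class.
    This is an assumed formula, not the one from the paper.
    """
    total = 0.0
    for i in range(len(data)):
        granule = fuzzy_neighborhood_granule(data, i, genes, delta)
        size = sum(granule)  # at least 1: a sample is fully similar to itself
        same = sum(m for m, lab in zip(granule, labels) if lab == labels[i])
        total -= math.log2(same / size)
    return total / len(data)


def greedy_select(data, labels, delta, tol=1e-9):
    """Forward selection: add the gene that lowers the entropy most,
    and stop when no remaining gene lowers it by more than tol."""
    remaining = list(range(len(data[0])))
    selected = []
    best = float("inf")
    while remaining:
        scores = {
            g: conditional_entropy(data, labels, selected + [g], delta)
            for g in remaining
        }
        g = min(scores, key=scores.get)
        if best - scores[g] <= tol:
            break
        selected.append(g)
        remaining.remove(g)
        best = scores[g]
    return selected


if __name__ == "__main__":
    # Toy data (made up): gene 0 separates the two classes, gene 1 does not.
    data = [[0.1, 0.5], [0.2, 0.1], [0.8, 0.45], [0.9, 0.15]]
    labels = [0, 0, 1, 1]
    print(greedy_select(data, labels, delta=0.25))
```

In this toy setting a smaller `delta` makes the granules tighter, so fewer samples from other classes fall inside them; how the paper tunes its parameters is not stated in the abstract.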