Wang LiMin, Cao FangYuan, Wang ShuangCheng, Sun MingHui, Dong LiYan
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, ChangChun City 130012, China.
Lixin Accounting Research Institute, Shanghai Lixin University of Commerce, Shanghai City 201620, China.
PLoS One. 2017 Aug 17;12(8):e0182070. doi: 10.1371/journal.pone.0182070. eCollection 2017.
Numerous data mining models have been proposed for building computer-aided medical expert systems. Among them, Bayesian network classifiers (BNCs) are more explicit and easier to interpret than other models. To describe graphically the dependency relationships among clinical variables in thyroid disease diagnosis, and to ensure that the diagnostic results are reasonable, the proposed k-dependence causal forest (KCF) model generates a series of submodels within the framework of the maximum spanning tree (MST) and demonstrates stronger dependence representation. A Friedman test on 12 UCI datasets shows that KCF has an advantage in classification accuracy over other state-of-the-art BNCs, such as Naive Bayes, tree augmented Naive Bayes, and the k-dependence Bayesian classifier. An extensive experimental comparison on 4 medical datasets also supports the feasibility and effectiveness of KCF in terms of sensitivity and specificity.
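The abstract evaluates KCF by sensitivity and specificity. As a reminder of what these two measures compute, here is a minimal sketch for a binary diagnostic task. The function name `sensitivity_specificity` and the toy labels are assumptions made for illustration only; they are not taken from the paper and do not reproduce any of its results.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): fraction of diseased cases detected.
    specificity = TN / (TN + FP): fraction of healthy cases correctly cleared.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against an empty class to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels (1 = disease present, 0 = absent), invented for this sketch.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Reporting both measures matters in diagnosis because overall accuracy can look high on an imbalanced dataset even when most diseased cases are missed.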