
New Classification Method for Independent Data Sources Using Pawlak Conflict Model and Decision Trees

Author information

Przybyła-Kasperek Małgorzata, Kusztal Katarzyna

Affiliation

Institute of Computer Science, University of Silesia in Katowice, Będzińska 39, 41-200 Sosnowiec, Poland.

Publication information

Entropy (Basel). 2022 Nov 4;24(11):1604. doi: 10.3390/e24111604.

DOI: 10.3390/e24111604
PMID: 36359694
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9689716/

Abstract

The research concerns data collected in independent sets, more specifically in local decision tables. A possible approach to managing these data is to build local classifiers based on each table individually. The literature offers many approaches for combining the final predictions of independent classifiers, but little effort has been devoted to studying cooperation between tables and the formation of coalitions. Such an approach was expected to matter on two levels. First, for classification quality: building combined classifiers for coalitions of tables should allow more generalized concepts to be learned, which in turn should improve the classification of new objects. Second, for computational complexity: combining tables into coalitions reduces the number of classifiers that must be built. The paper proposes a new method for creating coalitions of local tables and generating an aggregated classifier for each coalition. Coalitions are generated by determining certain characteristics of the attribute values occurring in local tables and applying the Pawlak conflict analysis model. For each coalition, a classification and regression tree with the Gini index is built from the coalition's aggregated table. The system has a hierarchical structure: in the next stage, the decisions generated by the coalition classifiers are aggregated using majority voting. The classification quality of the proposed system was compared with a baseline approach that uses neither local data cooperation nor coalition creation; in that baseline, the structure of the system is parallel and decision trees are built independently for the local tables. The paper shows that the proposed approach provides a significant improvement in classification quality and execution time. The Wilcoxon test confirmed that the differences in accuracy between the proposed method and the approach without coalitions are significant, at a level of 0.005. The average accuracy rates for the proposed approach and the approach without coalitions are 0.847 and 0.812, respectively, a difference of 0.035. Moreover, the algorithm implementing the proposed approach ran up to 21 times faster than the algorithm implementing the approach without coalitions.
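
The coalition and voting stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each local table has already been summarized as a vector of opinions in {-1, 0, 1}, one per attribute; the abstract does not say how those values are derived from the attribute characteristics, so the vectors below are made up. It also assumes the standard Pawlak conflict function and reads a coalition as a maximal group of tables in which every pair has a conflict value below 0.5. The names `conflict`, `coalitions`, and `majority_vote` are hypothetical, and the tree-building step (CART with the Gini index on each coalition's aggregated table) is left out.

```python
from collections import Counter
from itertools import combinations


def conflict(u, v):
    """Pawlak conflict function for two distinct agents: the mean
    per-attribute disagreement between opinion vectors over {-1, 0, 1}."""
    def phi(a, b):
        product = a * b
        if product == 1:    # same non-neutral opinion
            return 0.0
        if product == -1:   # opposite opinions
            return 1.0
        return 0.5          # at least one side is neutral
    return sum(phi(a, b) for a, b in zip(u, v)) / len(u)


def coalitions(opinions, threshold=0.5):
    """Return the maximal groups of tables in which every pair is allied,
    i.e. has a conflict value below `threshold` (assumed reading)."""
    names = sorted(opinions)
    allied = {name: set() for name in names}
    for x, y in combinations(names, 2):
        if conflict(opinions[x], opinions[y]) < threshold:
            allied[x].add(y)
            allied[y].add(x)

    found = []

    def expand(clique, candidates, excluded):
        # Bron-Kerbosch enumeration of maximal cliques (no pivoting).
        if not candidates and not excluded:
            found.append(sorted(clique))
            return
        for name in sorted(candidates):
            expand(clique | {name},
                   candidates & allied[name],
                   excluded & allied[name])
            candidates = candidates - {name}
            excluded = excluded | {name}

    expand(set(), set(names), set())
    return sorted(found)


def majority_vote(predictions):
    """Final decision: the class predicted by most coalition classifiers.
    Ties go to the class that appears first in `predictions`."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical opinion vectors for four local tables over four attributes.
opinions = {
    "T1": [1, 1, -1, 0],
    "T2": [1, 1, -1, 1],
    "T3": [-1, -1, 1, 0],
    "T4": [-1, 0, 1, -1],
}

print(coalitions(opinions))                  # [['T1', 'T2'], ['T3', 'T4']]
print(majority_vote(["yes", "no", "yes"]))   # yes
```

In this toy example four local tables collapse into two coalitions, so only two trees would need to be trained instead of four, which is the source of the reduced computational cost that the abstract mentions.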


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/8647804d9fc9/entropy-24-01604-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/afa1bb57f8ab/entropy-24-01604-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/a767f31f9848/entropy-24-01604-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/d24a04bc7cce/entropy-24-01604-g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/1d4f2819dae5/entropy-24-01604-g005.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/b3b1ca01d8c8/entropy-24-01604-g006.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/f01a9643ea14/entropy-24-01604-g007.jpg
Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3304/9689716/c692850aa529/entropy-24-01604-g008.jpg

Similar articles

1
New Classification Method for Independent Data Sources Using Pawlak Conflict Model and Decision Trees.
Entropy (Basel). 2022 Nov 4;24(11):1604. doi: 10.3390/e24111604.
2
Study on the Use of Artificially Generated Objects in the Process of Training MLP Neural Networks Based on Dispersed Data.
Entropy (Basel). 2023 Apr 24;25(5):703. doi: 10.3390/e25050703.
3
Machine learning for improved pathological staging of prostate cancer: a performance comparison on a range of classifiers.
Artif Intell Med. 2012 May;55(1):25-35. doi: 10.1016/j.artmed.2011.11.003. Epub 2011 Dec 27.
4
Machine learning algorithms for outcome prediction in (chemo)radiotherapy: An empirical comparison of classifiers.
Med Phys. 2018 Jul;45(7):3449-3459. doi: 10.1002/mp.12967. Epub 2018 Jun 13.
5
Neural Network Used for the Fusion of Predictions Obtained by the K-Nearest Neighbors Algorithm Based on Independent Data Sources.
Entropy (Basel). 2021 Nov 25;23(12):1568. doi: 10.3390/e23121568.
6
Multiple Classifiers Based Semi-Supervised Polarimetric SAR Image Classification Method.
Sensors (Basel). 2021 Apr 25;21(9):3006. doi: 10.3390/s21093006.
7
Management and governance processes in community health coalitions: a procedural justice perspective.
Health Educ Behav. 2002 Dec;29(6):737-54. doi: 10.1177/109019802237941.
8
The influence of community context on how coalitions achieve HIV-preventive structural change.
Health Educ Behav. 2014 Feb;41(1):100-7. doi: 10.1177/1090198113492766. Epub 2013 Jul 12.
9
Selected Data Mining Tools for Data Analysis in Distributed Environment.
Entropy (Basel). 2022 Oct 1;24(10):1401. doi: 10.3390/e24101401.
10
A decision tree-based method for the differential diagnosis of Aortic Stenosis from Mitral Regurgitation using heart sounds.
Biomed Eng Online. 2004 Jun 29;3(1):21. doi: 10.1186/1475-925X-3-21.

Cited by

1
A multi-layer perceptron neural network for varied conditional attributes in tabular dispersed data.
PLoS One. 2024 Dec 2;19(12):e0311041. doi: 10.1371/journal.pone.0311041. eCollection 2024.
2
Proximal humeral bone density assessment and prediction analysis using machine learning techniques: An innovative approach in medical research.
Heliyon. 2024 Jul 31;10(15):e35451. doi: 10.1016/j.heliyon.2024.e35451. eCollection 2024 Aug 15.
