Clinical Decision Support Systems: From the Perspective of Small and Imbalanced Data Set.

Author information

Par Oznur Esra, Akcapinar Sezer Ebru, Sever Hayri

Affiliations

Turkish Aerospace.

Hacettepe University.

Publication information

Stud Health Technol Inform. 2019 Jul 4;262:344-347. doi: 10.3233/SHTI190089.

Abstract

Clinical decision support systems are data analysis software that supports health professionals' decision-making process, taking patient information into account, to reach a final outcome. The need for decision support systems cannot be denied, because most activities in health care involve decision making. Decision support systems used for diagnosis are designed per disease because of the complexity of diseases, symptoms, and disease-symptom relationships. Mathematical modeling, pattern recognition, statistical analysis of large databases, and data mining techniques such as classification are widely used in the design and implementation of clinical decision support systems. Classification is difficult when the data set is small and/or imbalanced, and this directly affects classification performance. Small and/or imbalanced data sets have become a major problem in data mining because classification algorithms are developed on the assumption that data sets are balanced and large enough. Most algorithms ignore or misclassify examples of the minority class and focus on the majority class. Most health data are small and imbalanced by nature. Learning from small and imbalanced data sets is an important and unsolved problem. In this study, the publicly accessible hepatitis data set was oversampled using distance-based data generation methods. The oversampled data sets were classified using four machine learning algorithms (Artificial Neural Networks, Support Vector Machines, Naive Bayes, and Decision Tree), and an optimal synthetic data generation rate is recommended based on their classification scores.
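The abstract does not name the distance-based generation methods that were used. As an illustration only, the sketch below assumes a SMOTE-style scheme: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours (Euclidean distance). The function name `oversample_minority`, the parameters `k` and `seed`, the toy data, and the generation rates are all hypothetical and do not come from the paper.

```python
import math
import random


def oversample_minority(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows interpolated between minority rows.

    Assumption: SMOTE-style interpolation, not necessarily the paper's method.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new row at a random position on the segment base -> neighbour.
        t = rng.random()
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 2 features, 4 minority rows against 12 majority rows.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
majority_count = 12

# Try several generation rates (percent of the original minority size).
for rate in (100, 200):
    n_new = len(minority) * rate // 100
    new_points = oversample_minority(minority, n_new)
    print(rate, len(minority) + len(new_points), majority_count)
```

In a study like the one described, each oversampled data set would then be passed to the classifiers, and the generation rate with the best classification scores would be the one recommended.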
