

Data preprocessing for heart disease classification: A systematic literature review.

Authors

Benhar H, Idri A, Fernández-Alemán J L

Affiliations

Software Project Management Research Team, ENSIAS, University Mohammed V in Rabat, Morocco.

Software Project Management Research Team, ENSIAS, University Mohammed V in Rabat, Morocco; CSEHS-MSDA, Mohammed VI Polytechnic University, Benguerir, Morocco.

Publication information

Comput Methods Programs Biomed. 2020 Oct;195:105635. doi: 10.1016/j.cmpb.2020.105635. Epub 2020 Jul 3.

Abstract

CONTEXT

Early detection of heart disease is an important challenge, since 17.3 million people die of heart disease every year. Moreover, any error in the diagnosis of cardiac disease can be dangerous and put a patient's life at risk. Accurate diagnosis is therefore critical in cardiology. Data mining (DM) classification techniques have been used to diagnose heart disease but are still limited by data quality problems such as inconsistencies, noise, missing data, outliers, high dimensionality and imbalanced data. Data preprocessing (DP) techniques have therefore been used to prepare data, with the goal of improving the performance of DM-based heart disease prediction systems.

OBJECTIVE

The purpose of this study is to review and summarize the current evidence on the use of preprocessing techniques in heart disease classification with regard to: (1) the DP tasks and techniques most frequently used, (2) the impact of DP tasks and techniques on classification performance in cardiology, (3) the overall performance of classifiers when using DP techniques, and (4) comparisons of different classifier-preprocessing combinations in terms of accuracy rate.

METHOD

A systematic literature review was carried out by identifying and analyzing empirical studies on the application of data preprocessing in heart disease classification published between January 2000 and June 2019. A total of 49 studies were selected and analyzed according to the aforementioned criteria.

RESULTS

The review results show that data reduction is the most frequently used preprocessing task in cardiology, followed by data cleaning. In general, preprocessing either maintained or improved the performance of heart disease classifiers. Some combinations, such as (ANN + PCA), (ANN + CHI) and (SVM + PCA), are promising in terms of accuracy. However, the deployment of these models in real-world diagnosis decision support systems is subject to several risks and limitations due to their lack of interpretability.
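As an illustration of the data reduction task mentioned above, the following is a minimal Python sketch of chi-squared feature scoring, assuming that CHI refers to chi-squared feature selection. It is not the method of any reviewed study: the function names, the feature names and the eight toy records are all hypothetical and exist only to show the computation.

```python
from collections import Counter


def chi_squared_score(feature, labels):
    """Pearson chi-squared statistic between a categorical feature and class labels."""
    n = len(feature)
    observed = Counter(zip(feature, labels))  # joint counts per (value, class)
    feature_totals = Counter(feature)         # marginal counts per feature value
    label_totals = Counter(labels)            # marginal counts per class
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Count expected if feature and class were independent.
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-squared score."""
    scores = {name: chi_squared_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data (not taken from any study): 1 = disease, 0 = no disease.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "chest_pain": ["yes", "yes", "yes", "yes", "no", "no", "no", "no"],
    "sex": ["m", "f", "m", "f", "m", "f", "m", "f"],
}

selected = select_top_k(columns, labels, k=1)
print(selected)
```

In this toy data the hypothetical `chest_pain` feature is perfectly associated with the label while `sex` is independent of it, so only `chest_pain` is kept. The retained features would then be passed to a classifier such as an ANN.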

