A systematic map of medical data preprocessing in knowledge discovery.

Author information

Software Project Management Research Team, ENSIAS, University Mohammed V of Rabat, Morocco.

Department of Informatics and Systems, Faculty of Computer Science, University of Murcia, Spain.

Publication information

Comput Methods Programs Biomed. 2018 Aug;162:69-85. doi: 10.1016/j.cmpb.2018.05.007. Epub 2018 May 5.

Abstract

BACKGROUND AND OBJECTIVE

Data mining (DM) has received increasing attention in the medical domain over the last decade and has been widely used to analyze medical datasets in order to extract useful knowledge and previously unknown patterns. However, historical medical data are often inconsistent, noisy, imbalanced, incomplete and high-dimensional. These problems lead to serious bias in predictive modeling and reduce the performance of DM techniques. Data preprocessing is therefore an essential step in knowledge discovery, because it improves the quality of the data and makes them suitable for DM techniques. The objective of this paper is to review the use of preprocessing techniques in clinical datasets.

METHODS

We performed a systematic mapping of studies on the application of data preprocessing to healthcare that were published between January 2000 and December 2017. A search string was defined on the basis of the mapping questions and the PICO categories. The search string was then applied to digital databases covering the fields of computer science and medical informatics in order to identify relevant studies. The studies were initially selected by reading their titles, abstracts and keywords. Those selected at that stage were then reviewed against a set of inclusion and exclusion criteria in order to eliminate any that were not relevant. This process resulted in 126 primary studies.

RESULTS

The selected studies were analyzed and classified according to their publication years and channels, research type, empirical type and contribution type. The findings of this mapping study revealed that researchers have paid considerable attention to preprocessing in medical DM in the last decade. A significant number of the selected studies used data reduction and data cleaning preprocessing tasks. Moreover, the disciplines in which preprocessing has received the most attention are cardiology, endocrinology and oncology.

CONCLUSIONS

Researchers should develop and implement standards for an effective integration of multiple medical data types. Moreover, we identified the need to perform literature reviews.
