Statistical data preparation: management of missing values and outliers.

Author information

Kwak Sang Kyu, Kim Jong Hae

Affiliations

Department of Medical Statistics, School of Medicine, Catholic University of Daegu, Daegu, Korea.

Department of Anesthesiology and Pain Medicine, School of Medicine, Catholic University of Daegu, Daegu, Korea.

Publication information

Korean J Anesthesiol. 2017 Aug;70(4):407-411. doi: 10.4097/kjae.2017.70.4.407. Epub 2017 Jul 27.

Abstract

Missing values and outliers are frequently encountered while collecting data. The presence of missing values reduces the data available to be analyzed, compromising the statistical power of the study, and eventually the reliability of its results. In addition, it causes a significant bias in the results and degrades the efficiency of the data. Outliers significantly affect the process of estimating statistics (e.g., the average and standard deviation of a sample), resulting in overestimated or underestimated values. Therefore, the results of data analysis are considerably dependent on the ways in which the missing values and outliers are processed. In this regard, this review discusses the types of missing values, ways of identifying outliers, and ways of dealing with both.
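
As a rough illustration of the point about outliers, the following minimal sketch compares the mean and sample standard deviation of a small sample with and without one extreme value. The data and the helper name `summarize` are hypothetical and made up for this example; they do not come from the article.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical measurements, invented for illustration only.
clean = [10, 12, 11, 13, 12, 11, 12]
with_outlier = clean + [60]  # the same sample plus one extreme value

clean_mean, clean_sd = summarize(clean)
outlier_mean, outlier_sd = summarize(with_outlier)

print(f"without outlier: mean={clean_mean:.2f}, sd={clean_sd:.2f}")
print(f"with outlier:    mean={outlier_mean:.2f}, sd={outlier_sd:.2f}")
```

In this made-up sample, a single extreme observation pulls both estimates upward, which is the kind of overestimation the abstract describes.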

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e63/5548942/da2f99db2b0c/kjae-70-407-g001.jpg
