Truncated outlier filtering.

Author information

Costa Peter J

Affiliation

Hologic, Incorporated, Marlborough, Massachusetts, USA.

Publication information

J Biopharm Stat. 2014;24(5):1115-29. doi: 10.1080/10543406.2014.926366.

Abstract

The statistical analysis of data can be heavily influenced by measurements of extreme value. If such measurements are contained in the remote tail ends of the true population distribution from which they are drawn, they are referred to as outliers. Neglecting to filter outliers from a sample can distort statistical computations and result in faulty conclusions. Conventional techniques identify measurements, whose distances from the mean exceed a selected multiple of the sample standard deviation, as outliers. Such approaches, however, can fail to classify measurements with large normalized distances as outliers. The truncated outlier filtering method first replaces the minimum and maximum of the population before computing the exclusion criterion. This mitigates the influence of abnormally large (or small) measurements on the normalized distance and hence yields a more compact criterion for outlier determination. Moreover, the method generalizes to two or more dimensions. Simulated one-dimensional and multidimensional data are analyzed. A discussion of the results is also presented.
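The contrast drawn in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the paper's implementation: the function names, the default multiple `k = 3`, and the choice to drop (rather than substitute for) one minimum and one maximum before computing the mean and standard deviation are all assumptions made for illustration.

```python
import numpy as np


def conventional_outliers(x, k=3.0):
    """Flag points farther than k sample standard deviations from the mean."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sd = x.std(ddof=1)  # sample standard deviation
    return np.abs(x - mu) > k * sd


def truncated_outliers(x, k=3.0):
    """Same criterion, but the mean and standard deviation are computed
    after setting aside one minimum and one maximum.

    Assumption: dropping the extremes is one reading of the abstract's
    "replaces the minimum and maximum"; the paper may do this differently.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 4:
        raise ValueError("need at least 4 measurements")
    order = np.argsort(x)
    core = x[order[1:-1]]  # sample without its smallest and largest value
    mu = core.mean()
    sd = core.std(ddof=1)
    # The criterion is still applied to every measurement, extremes included.
    return np.abs(x - mu) > k * sd


if __name__ == "__main__":
    data = [9.8, 9.9, 10.0, 10.0, 10.1, 10.1, 10.2, 10.0, 9.9, 25.0]
    print("conventional:", conventional_outliers(data))
    print("truncated:   ", truncated_outliers(data))
```

With ten measurements, a single extreme value inflates the sample standard deviation enough that its normalized distance cannot reach 3, so the conventional rule flags nothing in this example. Computing the criterion without the extremes yields a much tighter threshold, which is the effect the abstract calls a "more compact criterion".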
