A simple transformation independent method for outlier definition.

Affiliations

Unit of Clinical Biostatistics, Aalborg University Hospital, Aalborg, Denmark.

Department of Clinical Biochemistry, Aalborg University Hospital, Hobrovej 18-22, 9000 Aalborg, Denmark, Phone: +45 97649000.

Publication information

Clin Chem Lab Med. 2018 Aug 28;56(9):1524-1532. doi: 10.1515/cclm-2018-0025.

Abstract

BACKGROUND

Defining and eliminating outliers is a key step for medical laboratories establishing or verifying reference intervals (RIs), especially because including just a few outlying observations may seriously affect the determination of the reference limits. Many methods have been developed for defining outliers. Several of these assume a normal distribution, so data often require transformation before outlier elimination.

METHODS

We have developed a non-parametric, transformation-independent outlier definition. The new method relies on drawing reproducible histograms, which is done by using defined bin sizes above and below the median. The method is compared with the method recommended by CLSI/IFCC, which uses Box-Cox transformation (BCT) and Tukey's fences for outlier definition. The comparison is done on eight simulated distributions and an indirect clinical dataset.
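
For orientation, the comparison method (BCT followed by Tukey's fences) can be sketched roughly as below. This is a minimal illustration and not the authors' implementation. The grid search for the Box-Cox parameter, the fence multiplier k = 1.5, the "inclusive" quartile rule, and all function names are assumptions made for this example. The proposed histogram-based method is not sketched, because the abstract does not give its bin sizes.

```python
import math
import statistics


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def box_cox_log_likelihood(values, lam):
    """Profile log-likelihood of lambda, assuming normal transformed data."""
    n = len(values)
    variance = statistics.pvariance(box_cox(values, lam))
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * log_sum


def estimate_lambda(values, grid=None):
    """Choose lambda by grid search (assumed grid: -2.0 to 2.0, step 0.1)."""
    if grid is None:
        grid = [round(-2.0 + 0.1 * i, 1) for i in range(41)]
    return max(grid, key=lambda lam: box_cox_log_likelihood(values, lam))


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Flag original values whose transformed value lies outside the fences."""
    lam = estimate_lambda(values)
    transformed = box_cox(values, lam)
    lower, upper = tukey_fences(transformed, k)
    return [v for v, t in zip(values, transformed) if t < lower or t > upper]


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    sample = [4.1, 4.5, 4.8, 5.0, 5.1, 5.3, 5.6, 5.9, 6.2, 60.0]
    print("estimated lambda:", estimate_lambda(sample))
    print("flagged:", flag_outliers(sample))
```

Because the Box-Cox parameter is estimated from the same data that contain the suspect values, calling `estimate_lambda` with and without an extreme value is a simple way to see how outliers can shift the transformation itself.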

RESULTS

The comparison on simulated distributions shows that, when no outliers are added, the recommended method in general defines fewer outliers. However, when outliers are added on one side, the proposed method often produces better results. With outliers on both sides, the two methods perform equally well. Furthermore, the presence of outliers affects the BCT and subsequently the limits determined by the currently recommended methods. This is especially seen in skewed distributions. The proposed outlier definition reproduced current RI limits on clinical data containing outliers.

CONCLUSIONS

We find our simple, transformation-independent outlier detection method to be as good as or better than the currently recommended methods.
