Detection of outliers in reference distributions: performance of Horn's algorithm.

Author information

Solberg Helge Erik, Lahti Ari

Affiliation

Department of Medical Biochemistry, Rikshospitalet-Radiumhospitalet HF, Oslo, Norway.

Publication information

Clin Chem. 2005 Dec;51(12):2326-32. doi: 10.1373/clinchem.2005.058339. Epub 2005 Oct 13.

DOI:10.1373/clinchem.2005.058339
PMID:16223885
Abstract

BACKGROUND

Medical laboratory reference data may be contaminated with outliers that should be eliminated before estimation of the reference interval. A statistical test for outliers has been proposed by Paul S. Horn and coworkers (Clin Chem 2001;47:2137-45). The algorithm operates in 2 steps: (a) mathematically transform the original data to approximate a gaussian distribution; and (b) establish detection limits (Tukey fences) based on the central part of the transformed distribution.
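
The two steps described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation. It assumes a Box and Cox power transformation whose parameter is picked by a grid search over the profile log-likelihood, and Tukey fences placed 1.5 interquartile ranges beyond the quartiles of the transformed data. The function names, the grid, and the multiplier `k=1.5` are assumptions made for this example.

```python
import math
import random
import statistics

def box_cox(x, lam):
    """Box and Cox power transformation of one strictly positive value."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam

def profile_log_likelihood(data, lam):
    """Profile log-likelihood of lam under a gaussian model for the transformed data."""
    n = len(data)
    transformed = [box_cox(x, lam) for x in data]
    variance = statistics.pvariance(transformed)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(x) for x in data)

def estimate_lambda(data, grid=None):
    """Pick lam from a grid (assumed here: -3 to 3 in steps of 0.05)."""
    if grid is None:
        grid = [i / 20.0 for i in range(-60, 61)]
    return max(grid, key=lambda lam: profile_log_likelihood(data, lam))

def tukey_fences(values, k=1.5):
    """Lower and upper fences built from the central half of the data."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def horn_outliers(data, k=1.5, lam=None):
    """Step (a): transform toward gaussian shape. Step (b): flag values outside the fences."""
    if lam is None:
        lam = estimate_lambda(data)
    transformed = [box_cox(x, lam) for x in data]
    low, high = tukey_fences(transformed, k)
    return [x for x, y in zip(data, transformed) if y < low or y > high]

# Usage: flag suspicious values in a small, strictly positive sample,
# here with lam fixed at 1.0 so the transformation is only a shift.
print(horn_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000], lam=1.0))
```

The fences are computed from the quartiles only, so a few extreme values barely move them. The transformation parameter, however, is estimated from all observations, outliers included, which is one reason the performance of the procedure depends on the shape of the underlying distribution.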

METHODS

We studied the specificity of Horn's test algorithm (probability of false detection of outliers), using Monte Carlo computer simulations performed on 13 types of probability distributions covering a wide range of positive and negative skewness. Distributions with 3% of the original observations replaced by random outliers were used to also examine the sensitivity of the test (probability of detection of true outliers). Three data transformations were used: the Box and Cox function (used in the original Horn's test), the Manly exponential function, and the John and Draper modulus function.
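
A Monte Carlo experiment of this kind can be sketched roughly as below. The sketch is an assumption-laden illustration rather than the study's protocol: it draws from a standard gaussian parent only (so the transformation step is skipped), plants outliers by shifting 3% of the observations into the upper tail, and uses Tukey fences with a multiplier of 1.5. The sample size, number of runs, shift, and all function names are hypothetical choices for this example.

```python
import random
import statistics

def tukey_flags(sample, k=1.5):
    """Return one boolean per observation: True if it lies outside the Tukey fences."""
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x < low or x > high for x in sample]

def simulate(n_runs=500, n=120, contamination=0.0, shift=6.0, seed=1):
    """Estimate the false detection rate and the sensitivity by simulation.

    contamination: fraction of observations replaced by planted outliers.
    shift: assumed location of the planted outliers (upper tail only).
    """
    rng = random.Random(seed)
    n_out = round(n * contamination)
    false_hits = clean_total = true_hits = outlier_total = 0
    for _ in range(n_runs):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        planted = set(rng.sample(range(n), n_out))
        for i in planted:
            sample[i] = shift + rng.gauss(0.0, 1.0)
        for i, flagged in enumerate(tukey_flags(sample)):
            if i in planted:
                outlier_total += 1
                true_hits += flagged
            else:
                clean_total += 1
                false_hits += flagged
    false_rate = false_hits / clean_total
    sensitivity = true_hits / outlier_total if outlier_total else float("nan")
    return false_rate, sensitivity

# Usage: uncontaminated samples give the false detection rate,
# contaminated samples additionally give the sensitivity.
print(simulate(contamination=0.0))
print(simulate(contamination=0.03))
```

Repeating the same loop with skewed parents, a transformation step before `tukey_flags`, and outliers planted in the lower tail or in both tails would be the natural extension toward the design described above.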

RESULTS

For many of the probability distributions, the specificity of Horn's algorithm was rather poor compared with the theoretical expectation. The cause for such poor performance was at least partially related to remaining nongaussian kurtosis (peakedness). The sensitivity showed great variation, dependent on both the type of underlying distribution and the location of the outliers (upper and/or lower tail).
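
The role of kurtosis can be illustrated with a small worked calculation. The sketch below compares the theoretical fraction of observations falling outside Tukey fences (multiplier 1.5, an assumption) for two symmetric distributions: the standard gaussian, with excess kurtosis 0, and the standard Laplace distribution, with excess kurtosis 3. The Laplace distribution is chosen here only as a convenient heavy-tailed example and is not claimed to be one of the distributions used in the study.

```python
import math
from statistics import NormalDist

def gaussian_flag_rate(k=1.5):
    """Fraction of a gaussian population lying outside the Tukey fences."""
    z = NormalDist()
    q3 = z.inv_cdf(0.75)          # upper quartile; the lower quartile is -q3
    fence = q3 + k * (2.0 * q3)   # the interquartile range is 2 * q3
    return 2.0 * (1.0 - z.cdf(fence))

def laplace_flag_rate(k=1.5):
    """Same fraction for a standard Laplace population (density 0.5 * exp(-|x|))."""
    q3 = math.log(2.0)            # upper quartile of the standard Laplace
    fence = q3 + k * (2.0 * q3)
    return math.exp(-fence)       # two tails, each 0.5 * exp(-fence)

print(f"gaussian: {gaussian_flag_rate():.4f}")
print(f"Laplace:  {laplace_flag_rate():.4f}")
```

Both distributions are perfectly symmetric, so no skewness-correcting transformation is needed, yet the heavier-tailed one sends roughly nine times as many clean observations outside the fences (about 6.25% versus about 0.7%). Residual nongaussian kurtosis after transformation can therefore inflate false detections on its own.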

CONCLUSION

Although Horn's algorithm undoubtedly is an improvement compared with older methods for outlier detection, reliable statistical identification of outliers in reference data remains a challenge.
