Distribution of variables by method of outlier detection.

Author information

Finch W Holmes

Affiliation

Department of Educational Psychology, Ball State University, Muncie, IN, USA.

Publication information

Front Psychol. 2012 Jul 5;3:211. doi: 10.3389/fpsyg.2012.00211. eCollection 2012.

Abstract

The presence of outliers can be very problematic in data analysis, leading statisticians to develop a wide variety of methods for identifying them in both the univariate and multivariate contexts. In the multivariate case, perhaps the most popular approach has been Mahalanobis distance, where large values suggest an observation that is unusual compared to the center of the data. However, researchers have identified problems with the application of this metric, such that its utility may be limited in some situations. As a consequence, other methods for detecting outlying observations have been developed and studied. A number of these approaches, while apparently robust and useful, have not made their way into general practice in the social sciences. Thus, the goal of this study was to describe some of these methods and demonstrate them using a well-known dataset from a popular multivariate textbook widely used in the social sciences. Results demonstrated that the methods do indeed result in datasets with very different distributional characteristics. These results are discussed in light of how they might be used by researchers and practitioners.
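
As a minimal sketch of the Mahalanobis distance idea mentioned in the abstract, the Python snippet below computes the squared distance of each observation from the sample mean, using the classical (non-robust) sample covariance. The function name `mahalanobis_sq`, the synthetic data, and the planted unusual row are assumptions made for illustration; they are not taken from the study, and the snippet does not implement any of the alternative methods the study compares.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row of X from the sample mean.

    Uses the classical sample mean and covariance (not a robust estimator).
    """
    X = np.asarray(X, dtype=float)
    center = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)      # sample covariance, ddof=1
    diff = X - center
    # Solve cov @ z = diff.T instead of forming the inverse explicitly.
    z = np.linalg.solve(cov, diff.T)
    # Row-wise quadratic form diff_i' * cov^{-1} * diff_i.
    return np.einsum("ij,ji->i", diff, z)

# Hypothetical data: 200 observations on 3 variables, plus one planted
# observation far from the center of the data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [8.0, -8.0, 8.0]

d2 = mahalanobis_sq(X)
# Indices of the five observations with the largest distances.
most_unusual = np.argsort(d2)[::-1][:5]
print(most_unusual, d2[most_unusual])
```

Because the mean and covariance here are estimated from all observations, including the unusual one, extreme points can distort the very quantities used to detect them. This is one commonly cited limitation of the classical distance.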

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad63/3389806/aabdcc10bfb4/fpsyg-03-00211-g001.jpg
