Gündüz Necla, Aydın Celal
Faculty of Science, Department of Statistics, Gazi University, Ankara, TURKEY.
J Appl Stat. 2021 Jul 11;48(13-15):2239-2258. doi: 10.1080/02664763.2021.1944999. eCollection 2021.
In this study, we provide a simulation-based exploration and characterization of the two most crucial kernel density functionals, which play a central role in kernel density estimation, for probability density functions that are members of the location-scale family. Kernel density functional estimates are known to depend on the choice of preliminary bandwidth. Normal-scale estimators are commonly used to obtain preliminary bandwidth estimates, under the assumption that the data come from a normal distribution. Here, we present an alternative approach, called Cauchy-scale estimators, to obtain preliminary bandwidth estimates. In this approach, the data are assumed to come from a Cauchy distribution. We also present results on the sampling distribution of the bandwidth estimators based on the normal-scale and Cauchy-scale approaches. As a case study, we characterize the effect of different contamination levels through a simulation study built on random samples from normal distributions with various parameters and various contamination levels. In our simulations, the proposed preliminary bandwidth selection shows lower variance for both mixture and contaminated data. In addition, in our applications to a real data set, the functional bandwidth gives results similar to the simulation results.
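The reference-distribution idea behind the two approaches can be illustrated with a minimal sketch. It is not the authors' estimator: the paper's preliminary bandwidths target density functionals and may use different constants. The sketch assumes a Gaussian kernel and the standard AMISE-optimal bandwidth for the density itself, with the roughness of the second derivative taken from either a normal or a Cauchy reference density. The function names, the IQR-based scale estimates, and the contaminated sample are all illustrative assumptions.

```python
import numpy as np


def normal_scale_bandwidth(x):
    """Reference bandwidth assuming normal data (Gaussian kernel).

    Uses h = (4/3)^(1/5) * sigma * n^(-1/5), with sigma estimated
    robustly as min(sample sd, IQR / 1.349).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    q75, q25 = np.percentile(x, [75, 25])
    sigma = min(x.std(ddof=1), (q75 - q25) / 1.349)
    return (4.0 / 3.0) ** 0.2 * sigma * n ** (-0.2)


def cauchy_scale_bandwidth(x):
    """Reference bandwidth assuming Cauchy data (Gaussian kernel).

    Assumption of this sketch: the roughness of f'' for a Cauchy
    density with scale gamma is 3 / (4 * pi * gamma^5), which gives
    h = (2 * sqrt(pi) / 3)^(1/5) * gamma * n^(-1/5).
    The Cauchy quartiles sit at location +/- gamma, so gamma = IQR / 2.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    q75, q25 = np.percentile(x, [75, 25])
    gamma = (q75 - q25) / 2.0
    return (2.0 * np.sqrt(np.pi) / 3.0) ** 0.2 * gamma * n ** (-0.2)


# Hypothetical contaminated sample: 95% N(0, 1) plus 5% N(0, 10^2).
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0, 1, 950), rng.normal(0, 10, 50)])
print(normal_scale_bandwidth(sample), cauchy_scale_bandwidth(sample))
```

Because the Cauchy-scale version relies only on the interquartile range, a few extreme observations change it very little, which is one plausible reason a Cauchy reference could behave more stably under contamination.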