Martín Jacinto, Parra María Isabel, Pizarro Mario Martínez, Sanjuán Eva López
Departamento de Matemáticas, Facultad de Ciencias, Universidad de Extremadura, 06006 Badajoz, Spain.
Departamento de Matemáticas, Centro Universitario de Mérida, Universidad de Extremadura, 06800 Mérida, Spain.
Entropy (Basel). 2022 Jan 25;24(2):178. doi: 10.3390/e24020178.
In the parameter estimation of limit extreme value distributions, most commonly employed methods use only part of the available data. With the peaks-over-threshold method for the Generalized Pareto Distribution (GPD), only the observations above a certain threshold are considered, so a large amount of information is wasted. The aim of this work is to make the most of the information provided by the observations in order to improve the accuracy of Bayesian parameter estimation. We present two new Bayesian methods to estimate the parameters of the GPD. Both take into account the whole data set from the baseline distribution, and both use the existing relations between the baseline and the limit GPD parameters to define highly informative priors. We compare the Bayesian Metropolis-Hastings algorithm applied to the data over the threshold with the new methods when the baseline distribution is a stable distribution. The properties of stable distributions ensure that the problem can be reduced to the study of standard distributions, and they also allow us to propose new estimators for the parameters of the tail distribution. Specifically, three cases of stable distributions were considered: the Normal, Lévy and Cauchy distributions, as main examples of the different tail behaviors a distribution can show. Nevertheless, the methods would be applicable to many other baseline distributions by finding the relations between baseline and GPD parameters through simulation studies. To illustrate this situation, we apply the methods to real air pollution data from Badajoz (Spain), whose baseline distribution is well fitted by a Gamma distribution, and show that the baseline methods improve the estimates compared with the Bayesian Metropolis-Hastings algorithm.
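As a rough illustration of the reference method mentioned above (Bayesian Metropolis-Hastings using only the data over the threshold), the following sketch fits the GPD shape and scale to threshold exceedances with a random-walk sampler. It is not the authors' implementation. The flat prior on the shape and on the logarithm of the scale, the proposal step sizes, the 90% threshold, and the names `gpd_loglik`, `mh_gpd`, `xi` and `sigma` are all assumptions made for this example. The informative priors built from the baseline distribution, which are the contribution of the paper, are not reproduced here.

```python
import numpy as np


def gpd_loglik(xi, sigma, y):
    """Log-likelihood of GPD(shape=xi, scale=sigma) for exceedances y >= 0."""
    if sigma <= 0:
        return -np.inf
    if abs(xi) < 1e-8:
        # Limit xi -> 0: exponential distribution with mean sigma.
        return -y.size * np.log(sigma) - y.sum() / sigma
    z = 1.0 + xi * y / sigma
    if np.any(z <= 0):
        # Some observation lies outside the support for these parameters.
        return -np.inf
    return -y.size * np.log(sigma) - (1.0 / xi + 1.0) * np.log(z).sum()


def mh_gpd(y, n_iter=20000, step=(0.1, 0.1), seed=0):
    """Random-walk Metropolis-Hastings on (xi, log sigma).

    Assumed prior (not taken from the paper): flat on xi and on log sigma,
    so the acceptance ratio reduces to a likelihood ratio.
    Returns an (n_iter, 2) array with columns xi and sigma.
    """
    rng = np.random.default_rng(seed)
    xi, log_sigma = 0.1, np.log(y.mean())  # crude but valid starting point
    current = gpd_loglik(xi, np.exp(log_sigma), y)
    chain = np.empty((n_iter, 2))
    for i in range(n_iter):
        xi_new = xi + step[0] * rng.normal()
        log_sigma_new = log_sigma + step[1] * rng.normal()
        proposed = gpd_loglik(xi_new, np.exp(log_sigma_new), y)
        # Symmetric proposal: accept with probability min(1, ratio).
        if np.log(rng.uniform()) < proposed - current:
            xi, log_sigma, current = xi_new, log_sigma_new, proposed
        chain[i] = xi, np.exp(log_sigma)
    return chain


if __name__ == "__main__":
    # Baseline sample from a standard Cauchy; only the exceedances over the
    # empirical 90% quantile are passed to the sampler.
    rng = np.random.default_rng(42)
    x = rng.standard_cauchy(5000)
    u = np.quantile(x, 0.9)
    y = x[x > u] - u
    chain = mh_gpd(y)
    xi_hat, sigma_hat = chain[5000:].mean(axis=0)  # drop burn-in draws
    print(f"threshold={u:.3f}  xi={xi_hat:.3f}  sigma={sigma_hat:.3f}")
```

In the demo, 4500 of the 5000 baseline observations never reach the sampler, which is the loss of information the abstract describes. A baseline-informed method would keep the same likelihood but replace the flat prior with one derived from the full sample.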