Zhou Yong-Feng, Lin Yu-Xuan, Fang Kai-Tai, Yin Hong
School of Mathematics, Renmin University of China, No. 59, Zhongguancun Street, Haidian District, Beijing 100872, China.
Research Center for Frontier Fundamental Studies, Zhejiang Lab, Kechuang Avenue, Zhongtai Sub-District, Yuhang District, Hangzhou 311121, China.
Entropy (Basel). 2024 Oct 22;26(11):889. doi: 10.3390/e26110889.
Assumptions about the underlying statistical distribution of data are critical in information theory, because they affect the accuracy and efficiency of communication and the definition of entropy. Real-world data are widely assumed to follow the normal distribution. To better capture skewness in the data, many models more flexible than the normal distribution have been proposed, such as the generalized alpha skew-t (GAST) distribution. This paper studies some properties of the GAST distribution, including the calculation of its moments and the relationship between the number of peaks and the GAST parameters, with some proofs. For complex probability distributions, representative points (RPs) are useful because they are convenient to manipulate, compute with, and analyze. The relative entropy of two probability distributions could be a good criterion for generating RPs of a specific distribution, but it is not widely used because of its computational complexity. Hence, this paper provides only three ways to obtain RPs of the GAST distribution: Monte Carlo (MC), quasi-Monte Carlo (QMC), and mean square error (MSE). The three types of RPs are used to estimate moments and densities of the GAST distribution with known and unknown parameters. The MSE representative points perform best in all case studies. For the unknown-parameter cases, a revised maximum likelihood estimation (MLE) method is compared with the plain MLE method. The comparison indicates that the revised MLE method is suitable for GAST distributions with a unimodal or not clearly bimodal pattern. The paper includes two real-data applications in which the GAST model appears adaptable to various types of data.
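The three kinds of representative points named in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: the target is the standard normal distribution as a stand-in (the GAST density is not given here), the QMC points are taken as target quantiles at (2i - 1)/(2k), and the MSE points are computed with a Lloyd-type fixed-point iteration. The function names `mc_points`, `qmc_points`, and `mse_points` are hypothetical, and the authors' actual procedures may differ.

```python
import math
import random
from statistics import NormalDist

# Stand-in target: the standard normal, NOT the GAST distribution.
TARGET = NormalDist(0.0, 1.0)


def mc_points(k, seed=0):
    """MC points: a sorted random sample of size k from the target."""
    rng = random.Random(seed)
    return sorted(rng.gauss(0.0, 1.0) for _ in range(k))


def qmc_points(k):
    """QMC-style points: target quantiles at (2i - 1) / (2k) (assumed rule)."""
    return [TARGET.inv_cdf((2 * i - 1) / (2 * k)) for i in range(1, k + 1)]


def _pdf(x):
    # Density of the target, with the limits at +/- infinity made explicit.
    return 0.0 if math.isinf(x) else TARGET.pdf(x)


def _cdf(x):
    # CDF of the target, with the limits at +/- infinity made explicit.
    if math.isinf(x):
        return 1.0 if x > 0 else 0.0
    return TARGET.cdf(x)


def _cell_mean(lo, hi):
    """Mean of a standard normal variable truncated to the interval (lo, hi)."""
    return (_pdf(lo) - _pdf(hi)) / (_cdf(hi) - _cdf(lo))


def mse_points(k, iterations=500):
    """MSE points via a Lloyd-type iteration (assumed method).

    Each pass splits the real line at the midpoints between neighbouring
    points, then moves every point to the conditional mean of its cell.
    """
    points = qmc_points(k)  # symmetric starting values
    for _ in range(iterations):
        cuts = [-math.inf]
        cuts += [(a + b) / 2 for a, b in zip(points, points[1:])]
        cuts.append(math.inf)
        points = [_cell_mean(lo, hi) for lo, hi in zip(cuts, cuts[1:])]
    return points


if __name__ == "__main__":
    for name, pts in [("MC", mc_points(5)), ("QMC", qmc_points(5)), ("MSE", mse_points(5))]:
        print(name, [round(p, 3) for p in pts])
```

For the standard normal with k = 2, the MSE points are ±sqrt(2/π) ≈ ±0.798, which gives a quick sanity check on the iteration. Applying the same idea to the GAST distribution would require replacing the stand-in density, CDF, and truncated mean with their GAST counterparts, typically by numerical integration.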