Zhang Zhiyi
Department of Mathematics and Statistics, UNC Charlotte, Charlotte, NC 28223, USA.
Entropy (Basel). 2023 Jul 13;25(7):1060. doi: 10.3390/e25071060.
Inspired by developments in modern data science, a shift is increasingly visible in the foundation of statistical inference: away from a real space, where random variables reside, toward a nonmetrized and nonordinal alphabet, where more general random elements reside. While statistical inference based on random variables is theoretically well supported by the rich literature of probability and statistics, inference on alphabets, mostly by way of various entropies and their estimation, is less systematically supported in theory. Without the familiar notions of neighborhood, real or complex moments, tails, et cetera, associated with random variables, probability and statistics based on random elements on alphabets need more attention to foster a sound framework for the rigorous development of entropy-based statistical exercises. In this article, several basic elements of entropic statistics are introduced and discussed, including the notions of general entropies, entropic sample spaces, entropic distributions, entropic statistics, entropic multinomial distributions, entropic moments, and entropic basis, among other entropic objects. In particular, an entropic-moment-generating function is defined and shown to uniquely characterize the underlying distribution from an entropic perspective and, hence, all entropies. An entropic version of the Glivenko-Cantelli convergence theorem is also established.
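As a small illustration of inference on a nonordinal alphabet, the sketch below computes the plug-in (empirical) Shannon entropy of a sample of categorical labels and checks that relabeling the letters leaves it unchanged. It is not taken from the article: the function name `plugin_entropy`, the sample, and the relabeling map are hypothetical, and Shannon entropy stands in here for the more general entropies the abstract refers to.

```python
import math
from collections import Counter


def plugin_entropy(sample):
    """Plug-in Shannon entropy (in nats) of the empirical distribution.

    Only the letter frequencies are used; the letters themselves are
    treated as arbitrary labels with no order and no distance.
    """
    counts = Counter(sample)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical sample on a three-letter alphabet.
sample = ["red", "blue", "red", "green", "red", "blue"]

# Relabel the letters; the frequencies are unchanged.
relabel = {"red": "x", "blue": "y", "green": "z"}
relabeled = [relabel[s] for s in sample]

h_original = plugin_entropy(sample)
h_relabeled = plugin_entropy(relabeled)

print(h_original, h_relabeled)
```

The two values agree because the estimator depends on the sample only through its letter counts, which is the sense in which an entropy is a property of the probabilities and not of the labels.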