Cholewa Marcin, Płaczek Bartłomiej
Institute of Computer Science, University of Silesia, Będzińska 39, 41-205 Sosnowiec, Poland.
Entropy (Basel). 2020 Oct 19;22(10):1173. doi: 10.3390/e22101173.
This paper introduces a new method of estimating Shannon entropy. The method is suitable for large data samples and enables fast computations to rank data samples according to their Shannon entropy. Original definitions of positional entropy and integer entropy are discussed in detail to explain the theoretical concepts that underpin the proposed approach. Relations between positional entropy, integer entropy, and Shannon entropy are demonstrated through computational experiments. The usefulness of the method was verified experimentally on data samples of various types and sizes. The experimental results show that the proposed approach can be used for fast entropy estimation. The analysis also addresses the quality of the entropy estimation. Several possible implementations of the method are discussed, and the presented algorithms are compared with existing solutions. The algorithms presented in this paper are shown to estimate Shannon entropy faster and more accurately than state-of-the-art algorithms.
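For orientation, the quantity being estimated can be sketched as follows. This is not the positional-entropy or integer-entropy method of the paper, whose definitions are not reproduced in this abstract. It is the standard plug-in Shannon entropy, H = sum over symbols of p * log2(1/p), computed from symbol frequencies and then used to rank samples. The function name `shannon_entropy` and the example samples are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(sample):
    """Plug-in Shannon entropy, in bits per symbol, of a sequence of hashable symbols.

    Baseline definition only; NOT the estimator proposed in the paper.
    """
    n = len(sample)
    if n == 0:
        return 0.0
    counts = Counter(sample)
    # p * log2(1/p) summed over observed symbols, with p = c / n
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Hypothetical samples, used only to illustrate ranking by entropy.
samples = {
    "four_symbols": b"abcdabcd",
    "constant": b"aaaaaaaa",
    "two_symbols": b"abababab",
}

# Rank sample names from lowest to highest entropy.
ranked = sorted(samples, key=lambda name: shannon_entropy(samples[name]))
```

The exact computation needs one pass to count symbols and one logarithm per distinct symbol. A fast estimator of the kind the abstract describes would aim to reproduce the same ranking at lower cost.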