

Entropy estimation in Turing's perspective.

Affiliation

Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA.

Publication

Neural Comput. 2012 May;24(5):1368-89. doi: 10.1162/NECO_a_00266. Epub 2012 Feb 1.

Abstract

A new nonparametric estimator of Shannon's entropy on a countable alphabet is proposed and analyzed against the well-known plug-in estimator. The proposed estimator is developed based on Turing's formula, which recovers distributional characteristics on the subset of the alphabet not covered by a size-n sample. The fundamental switch in perspective brings about substantial gain in estimation accuracy for every distribution with finite entropy. In general, a uniform variance upper bound is established for the entire class of distributions with finite entropy that decays at a rate of O(ln(n)/n), compared to O([ln(n)]^2/n) for the plug-in. In a wide range of subclasses, the variance of the proposed estimator converges at a rate of O(1/n), and this rate of convergence carries over to the convergence rates in mean squared errors in many subclasses. Specifically, for any finite alphabet, the proposed estimator has a bias decaying exponentially in n. Several new bias-adjusted estimators are also discussed.
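To make the two perspectives contrasted in the abstract concrete, the sketch below implements the baseline plug-in entropy estimator (substituting empirical frequencies into Shannon's formula) alongside Turing's formula for the missing mass, i.e., the estimated total probability of alphabet letters not observed in the sample. This is an illustrative sketch only; it shows the ingredients discussed in the abstract, not the paper's proposed estimator itself, and the function names are ours.

```python
import math
from collections import Counter

def plugin_entropy(sample):
    """Plug-in (maximum-likelihood) entropy estimate: substitute the
    empirical frequencies p_hat_k = c_k / n into H = -sum p_k ln p_k."""
    n = len(sample)
    counts = Counter(sample)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def turing_missing_mass(sample):
    """Turing's formula (Good-Turing): estimate the total probability
    of letters NOT covered by the size-n sample as N1 / n, where N1 is
    the number of letters observed exactly once."""
    n = len(sample)
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / n
```

For a sample of two symbols appearing twice each, `plugin_entropy` returns ln 2, and for a sample where half the observed letters are singletons, `turing_missing_mass` assigns that half of the mass to the unseen part of the alphabet. The plug-in estimator ignores the uncovered portion of the alphabet entirely, which is the source of the slower convergence rates cited in the abstract; Turing's formula is what lets the proposed estimator account for it.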

