Ensemble Estimation of Information Divergence

Author information

Moon Kevin R, Sricharan Kumar, Greenewald Kristjan, Hero Alfred O

Affiliations

Genetics Department and Applied Math Program, Yale University, New Haven, CT 06520, USA.

Intuit Inc., Mountain View, CA 94043, USA.

Publication information

Entropy (Basel). 2018 Jul 27;20(8):560. doi: 10.3390/e20080560.

Abstract

Recent work has focused on the problem of nonparametric estimation of information divergence functionals between two continuous random variables. Many existing approaches require either restrictive assumptions about the density support set or difficult calculations at the support set boundary, which must be known a priori. The mean squared error (MSE) convergence rate of a leave-one-out kernel density plug-in divergence functional estimator is derived for general bounded density support sets, where knowledge of the support boundary, and therefore the boundary correction, is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. Guidelines for the tuning parameter selection and the asymptotic distribution of this estimator are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions. The estimator is shown to be robust to the choice of tuning parameters. We show extensive simulation results that verify the theoretical results of our paper. Finally, we apply the proposed estimator to estimate the bounds on the Bayes error rate of a cell classification problem.
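To make the idea of a leave-one-out kernel density plug-in estimator and its ensemble concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a Gaussian kernel, a hand-picked list of bandwidths, and uniform ensemble weights; the paper's optimal weights come from an optimization that is not reproduced here. The function names `kde_at`, `renyi_plugin`, and `renyi_ensemble` are hypothetical. The sketch uses the identity D_α(f1‖f2) = log(E_{f2}[(f1/f2)^α]) / (α − 1), replacing the densities with kernel estimates and the expectation with a sample mean.

```python
import numpy as np

def kde_at(points, data, h, leave_one_out=False):
    """Gaussian-kernel density estimate built from `data`, evaluated at `points`.

    Assumption: an isotropic Gaussian kernel with bandwidth h (illustrative choice).
    """
    n, d = data.shape
    # Pairwise squared distances between evaluation points and data points.
    sq = ((points[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
    k = np.exp(-sq / (2 * h**2)) / ((2 * np.pi) ** (d / 2) * h**d)
    if leave_one_out:
        # `points` and `data` are the same sample: drop each point's own kernel.
        np.fill_diagonal(k, 0.0)
        return k.sum(axis=1) / (n - 1)
    return k.mean(axis=1)

def renyi_plugin(x, y, alpha, h):
    """Plug-in estimate of D_alpha(f_x || f_y) for a single bandwidth h."""
    fx = kde_at(y, x, h)                       # estimate of f_x at the y samples
    fy = kde_at(y, y, h, leave_one_out=True)   # leave-one-out estimate of f_y
    return np.log(np.mean((fx / fy) ** alpha)) / (alpha - 1)

def renyi_ensemble(x, y, alpha, bandwidths, weights=None):
    """Weighted combination of plug-in estimates over several bandwidths.

    Assumption: uniform weights by default, as a placeholder for the
    optimally chosen weights described in the abstract.
    """
    estimates = np.array([renyi_plugin(x, y, alpha, h) for h in bandwidths])
    if weights is None:
        weights = np.full(len(estimates), 1.0 / len(estimates))
    return float(np.dot(weights, estimates))

# Usage: two 1-D Gaussian samples whose means differ by one.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(400, 1))
y = rng.normal(1.0, 1.0, size=(400, 1))
print(renyi_ensemble(x, y, alpha=0.5, bandwidths=[0.3, 0.4, 0.5]))
```

With uniform weights the ensemble only averages the bias of the individual estimators; the point of the weighting theory in the abstract is to choose weights that cancel the leading bias terms instead.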

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9d43/7513085/68f199fcab08/entropy-20-00560-g001.jpg
