
Information-Corrected Estimation: A Generalization Error Reducing Parameter Estimation Method.

Author Information

Matthew Dixon, Tyler Ward

Affiliations

Department of Applied Mathematics, Illinois Institute of Technology, Chicago, IL 60616, USA.

Department of Financial Engineering, NYU Tandon School of Engineering, New York, NY 11201, USA.

Publication Information

Entropy (Basel). 2021 Oct 28;23(11):1419. doi: 10.3390/e23111419.

Abstract

Modern computational models in supervised machine learning are often highly parameterized universal approximators. As such, the values of the parameters are unimportant, and only the out-of-sample performance is considered. On the other hand, much of the literature on model estimation assumes that the parameters themselves have intrinsic value, and is thus concerned with the bias and variance of parameter estimates, which may not have any simple relationship to out-of-sample model performance. Therefore, within supervised machine learning, heavy use is made of ridge regression (i.e., L2 regularization), which requires the estimation of hyperparameters and can be rendered ineffective by certain model parameterizations. We introduce an objective function, which we refer to as Information-Corrected Estimation (ICE), that reduces KL-divergence-based generalization error for supervised machine learning. ICE attempts to directly maximize a corrected likelihood function as an estimator of the KL divergence. Such an approach is proven, theoretically, to be effective for a wide class of models, with only mild regularity restrictions. Under finite sample sizes, this corrected estimation procedure is shown experimentally to lead to significant reductions in generalization error compared to maximum likelihood estimation and L2 regularization.
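The abstract does not state the exact form of the ICE correction. As a rough sketch only, assuming the correction resembles a Takeuchi-style information penalty (adding tr(J⁻¹I)/n to the mean negative log-likelihood, where I is the empirical Fisher information from per-sample scores and J is the Hessian of the mean NLL), one could minimize a corrected objective for a simple Gaussian model like this; the function names and the Gaussian example are illustrative, not taken from the paper:

```python
import numpy as np
from scipy.optimize import minimize

def mean_nll(theta, x):
    """Mean negative log-likelihood of N(mu, sigma^2)."""
    mu, sigma = theta
    z = (x - mu) / sigma
    return np.mean(0.5 * np.log(2 * np.pi) + np.log(sigma) + 0.5 * z**2)

def ice_objective(theta, x):
    """Mean NLL plus an assumed tr(J^{-1} I)/n information correction."""
    mu, sigma = theta
    if sigma <= 0:
        return np.inf
    n = x.size
    # Per-sample scores (gradients of the per-sample NLL w.r.t. (mu, sigma)).
    s_mu = -(x - mu) / sigma**2
    s_sigma = 1.0 / sigma - (x - mu) ** 2 / sigma**3
    S = np.stack([s_mu, s_sigma], axis=1)
    I_hat = S.T @ S / n  # empirical Fisher information
    # Analytic Hessian of the mean NLL for the Gaussian model.
    d1 = np.mean(x - mu)
    m2 = np.mean((x - mu) ** 2)
    J_hat = np.array([
        [1.0 / sigma**2,          2.0 * d1 / sigma**3],
        [2.0 * d1 / sigma**3,    -1.0 / sigma**2 + 3.0 * m2 / sigma**4],
    ])
    penalty = np.trace(np.linalg.solve(J_hat, I_hat)) / n
    return mean_nll(theta, x) + penalty

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50)
res = minimize(ice_objective, x0=[x.mean(), x.std()], args=(x,),
               method="Nelder-Mead")
mu_hat, sigma_hat = res.x
```

For a well-specified model at the MLE, I ≈ J, so the penalty is roughly (number of parameters)/n, which matches the intuition that the correction vanishes as the sample size grows.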


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/952e/8621511/8809761d46b6/entropy-23-01419-g001.jpg
