Prediction and Variable Selection in High-Dimensional Misspecified Binary Classification.

Authors

Furmańczyk Konrad, Rejchel Wojciech

Affiliations

Institute of Information Technology, Warsaw University of Life Sciences (SGGW), Nowoursynowska 159, 02-776 Warszawa, Poland.

Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Chopina 12/18, 87-100 Toruń, Poland.

Publication information

Entropy (Basel). 2020 May 13;22(5):543. doi: 10.3390/e22050543.

Abstract

In this paper, we consider prediction and variable selection in misspecified binary classification models in the high-dimensional setting. We focus on two approaches to classification that are computationally efficient but lead to model misspecification. The first is to apply penalized logistic regression to classification data that possibly do not follow the logistic model. The second is even more radical: we simply treat the class labels of objects as if they were numbers and apply penalized linear regression. We investigate these two approaches thoroughly and provide conditions that guarantee they are successful in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper concludes with experimental results.
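
The second approach, treating class labels as numbers and fitting a penalized linear regression, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the simulated probit-type data (so that neither the logistic nor the linear model is correct), the lasso penalty, the value `lam=0.1`, and the plain coordinate descent solver. The helper names `simulate` and `lasso_on_labels` are hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def simulate(n=200, p=50, seed=0):
    """Hypothetical design: labels follow a probit-type model, so both the
    logistic and the linear model are misspecified. Only the first two
    predictors are relevant."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1.0 if 3.0 * row[0] - 3.0 * row[1] + rng.gauss(0.0, 1.0) > 0 else 0.0
         for row in X]
    return X, y


def lasso_on_labels(X, y, lam, sweeps=50):
    """Coordinate descent for (1/2n)*||y_c - X b||^2 + lam*||b||_1,
    where y_c are the 0/1 labels centered and treated as numbers."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]            # residual for b = 0
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    col_sq = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            col = cols[j]
            # Correlation of predictor j with the partial residual.
            rho = sum(c * r for c, r in zip(col, resid)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


X, y = simulate()
beta = lasso_on_labels(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected predictors:", selected)
```

In this simulated design only predictors 0 and 1 drive the labels, so the indices in `selected` give a rough sense of how variable selection behaves even though the fitted linear model is wrong. The first approach could be sketched the same way by replacing the squared loss with the logistic loss.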

Similar articles

On the robustness of the adaptive lasso to model misspecification.
Biometrika. 2012 Sep;99(3):717-731. doi: 10.1093/biomet/ass027. Epub 2012 Jul 11.
