Large unbalanced credit scoring using Lasso-logistic regression ensemble.

Authors

Wang Hong, Xu Qingsong, Zhou Lifeng

Affiliation

School of Mathematics & Statistics, Central South University, Changsha, Hunan, China.

Publication information

PLoS One. 2015 Feb 23;10(2):e0117844. doi: 10.1371/journal.pone.0117844. eCollection 2015.

Abstract

Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. The data are first balanced and diversified by clustering and bagging algorithms. We then apply a Lasso-logistic regression learning ensemble to evaluate credit risk. We show that the proposed algorithm outperforms popular credit scoring models such as decision trees, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
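The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea (balance the classes, fit an L1-penalized logistic regression on each resampled bag, average the predicted probabilities), not the authors' procedure. In particular, the clustering step is replaced here by plain random sampling of the majority class, label 1 is assumed to be the minority (default) class, and all function names, hyperparameters (`lam`, `lr`, `n_iter`, `n_models`) and the synthetic data are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_lasso_logistic(X, y, lam=0.01, lr=0.5, n_iter=300):
    # L1-penalized logistic regression fitted by proximal gradient descent.
    # The intercept b is not penalized.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def fit_balanced_ensemble(X, y, n_models=10, lam=0.01, seed=0):
    # Assumption: label 1 is the minority class. Each bag holds a bootstrap
    # sample of the minority class plus an equal-sized random sample of the
    # majority class (a stand-in for the paper's clustering-based balancing).
    rng = random.Random(seed)
    minority = [i for i, yi in enumerate(y) if yi == 1]
    majority = [i for i, yi in enumerate(y) if yi == 0]
    models = []
    for _ in range(n_models):
        idx = [rng.choice(minority) for _ in minority]
        idx += [rng.choice(majority) for _ in minority]
        models.append(fit_lasso_logistic([X[i] for i in idx],
                                         [y[i] for i in idx], lam=lam))
    return models

def ensemble_proba(models, x):
    # Average the predicted minority-class probability over all base models.
    probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
             for w, b in models]
    return sum(probs) / len(probs)

# Synthetic, unbalanced toy data (10% positives); purely illustrative.
data_rng = random.Random(42)
X, y = [], []
for i in range(400):
    label = 1 if i < 40 else 0
    X.append([data_rng.gauss(2.0 if label else 0.0, 1.0),
              data_rng.gauss(0.0, 1.0)])
    y.append(label)

models = fit_balanced_ensemble(X, y, n_models=5)
p_high = ensemble_proba(models, [3.0, 0.0])
p_low = ensemble_proba(models, [-1.0, 0.0])
```

In this toy setup only the first feature separates the classes, so `p_high` should exceed `p_low`. A real application would more likely use an established library implementation of L1-penalized logistic regression and evaluate the ensemble with AUC and F-measure on held-out data, as the abstract describes.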

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b0d/4338292/fbd00856e5a4/pone.0117844.g001.jpg
