Hajek Petr, Abedin Mohammad Zoynul, Sivarajah Uthayasankar
Science and Research Centre, Faculty of Economics and Administration, University of Pardubice, Studentska 84, Pardubice, 532 10 Czech Republic.
Department of Finance, Performance & Marketing, Teesside University International Business School, Teesside University, Middlesbrough, TS1 3BX Tees Valley UK.
Inf Syst Front. 2022 Oct 14:1-19. doi: 10.1007/s10796-022-10346-6.
Mobile payment systems are becoming more popular as the number of smartphones increases, which in turn attracts the interest of fraudsters. Extant research has therefore developed various fraud detection methods using supervised machine learning. However, sufficient labeled data are rarely available, and the detection performance of these methods is negatively affected by the extreme class imbalance in financial fraud data. The purpose of this study is to propose an XGBoost-based fraud detection framework that takes into account the financial consequences of fraud detection systems. The framework was empirically validated on a large dataset of more than 6 million mobile transactions. To demonstrate the effectiveness of the proposed framework, we conducted a comparative evaluation of existing machine learning methods designed for modeling imbalanced data and for outlier detection. The results suggest that, in terms of standard classification measures, the proposed semi-supervised ensemble model integrating multiple unsupervised outlier detection algorithms and an XGBoost classifier achieves the best results, while the highest cost savings are achieved by combining random under-sampling with XGBoost. This study therefore has financial implications for organizations deciding how to implement effective fraud detection systems.
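The two ideas the abstract contrasts, random under-sampling of the majority class and evaluation by cost savings rather than by classification measures alone, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The function names `random_undersample` and `cost_savings`, the `ratio` and `admin_cost` parameters, and the cost model itself (a missed fraud costs the transaction amount, every flagged transaction costs a fixed handling fee) are assumptions made for illustration only. The classifier step (XGBoost in the study) is omitted.

```python
import random


def random_undersample(labels, ratio=1.0, seed=0):
    """Return sorted row indices that keep every fraud row (label 1) and a
    random subset of legitimate rows (label 0), at most ratio * n_fraud."""
    rng = random.Random(seed)
    fraud = [i for i, y in enumerate(labels) if y == 1]
    legit = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(legit), int(ratio * len(fraud)))
    return sorted(fraud + rng.sample(legit, n_keep))


def cost_savings(y_true, y_pred, amounts, admin_cost=1.0):
    """Savings relative to running no detection system at all.

    Assumed (hypothetical) cost model:
      - flagged transaction (prediction 1): fixed admin_cost, fraud or not
      - missed fraud (prediction 0, truth 1): the full transaction amount
      - correctly passed legitimate transaction: no cost
    """
    # Without any detection, every fraudulent amount is lost.
    baseline = sum(a for y, a in zip(y_true, amounts) if y == 1)
    cost = 0.0
    for y, p, a in zip(y_true, y_pred, amounts):
        if p == 1:
            cost += admin_cost
        elif y == 1:
            cost += a
    return baseline - cost


# Toy data: 95 legitimate and 5 fraudulent transactions.
labels = [0] * 95 + [1] * 5
train_idx = random_undersample(labels, ratio=2.0)

# Toy evaluation: one fraud caught, one fraud missed, one false alarm.
savings = cost_savings(
    y_true=[1, 1, 0, 0],
    y_pred=[1, 0, 1, 0],
    amounts=[100.0, 50.0, 20.0, 10.0],
    admin_cost=2.0,
)
```

In a full pipeline, the rows selected by `train_idx` would be used to fit the classifier, and `cost_savings` would be computed on held-out predictions. Under a cost model of this kind, a model that ranks lower on standard classification measures can still save more money, which is consistent with the abstract's finding that different configurations come out on top under the two criteria.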