Azim Mim Mimusa, Majadi Nazia, Mazumder Peal
Department of Computer Science and Telecommunication Engineering, Noakhali Science and Technology University, Noakhali-3814, Bangladesh.
Heliyon. 2024 Feb 1;10(3):e25466. doi: 10.1016/j.heliyon.2024.e25466. eCollection 2024 Feb 15.
With the growth of e-commerce and modern technology, credit cards are widely used for both online and offline purchases, and the number of daily fraudulent transactions has risen accordingly. Organizations and financial institutions worldwide lose billions of dollars annually to credit card fraud. Because legitimate and fraudulent transactions are both distributed globally, it is difficult to distinguish between the two. Furthermore, because only a small proportion of transactions are fraudulent, the data suffer from class imbalance. Hence, an effective fraud-detection methodology is required to sustain the reliability of the payment system. Machine learning (ML) has recently emerged as a viable alternative for identifying this type of fraud. However, on large, imbalanced datasets, ML approaches struggle to identify fraud with high prediction accuracy while also reducing misclassification costs. In this research, a soft-voting ensemble learning approach for detecting credit card fraud on imbalanced data is proposed. The proposed approach is evaluated and compared under several sampling techniques (oversampling, undersampling, and hybrid sampling) used to overcome the class imbalance problem. We develop several credit card fraud classifiers, including ensemble classifiers, with and without sampling techniques. According to the experimental results, the proposed soft-voting approach outperforms individual classifiers. With a false negative rate (FNR) of 0.0306, it achieves a precision of 0.9870, recall of 0.9694, F1-score of 0.8764, and AUROC of 0.9936.
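To make the soft-voting idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each base classifier already outputs per-class probabilities for every transaction; soft voting then averages those probabilities and picks the class with the highest mean. The function names (`soft_vote`, `fraud_metrics`), the optional weights, and the toy probabilities are all hypothetical and chosen only for illustration. The abstract does not name the base classifiers, so none are assumed here.

```python
from typing import Dict, List, Optional, Sequence


def soft_vote(
    prob_sets: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Combine classifiers by (weighted) averaging of class probabilities.

    prob_sets[m][i][c] is the probability that model m assigns class c
    to sample i. Returns the index of the highest-mean class per sample.
    Ties resolve to the lowest class index.
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)  # assumption: equal weights
    total = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    labels = []
    for i in range(n_samples):
        avg = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


def fraud_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, FNR, and F1 with class 1 (fraud) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fnr = fn / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "fnr": fnr, "f1": f1}


# Toy data (made up): columns are [P(legitimate), P(fraud)] for four transactions.
model_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.7, 0.3]]
model_b = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.6, 0.4]]
model_c = [[0.7, 0.3], [0.3, 0.7], [0.1, 0.9], [0.4, 0.6]]

y_true = [0, 1, 1, 1]
y_pred = soft_vote([model_a, model_b, model_c])
metrics = fraud_metrics(y_true, y_pred)
```

In this toy run the second transaction is labeled fraud even though one model disagrees, because the averaged fraud probability exceeds 0.5. The metric helper also makes explicit that recall and FNR sum to 1, which matches the reported recall of 0.9694 and FNR of 0.0306.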