Fakhar Bilal Syed, Ali Almazroi Abdulwahab, Bashir Saba, Hassan Khan Farhan, Ali Almazroi Abdulaleem
Computer Science Department, Federal Urdu University of Arts, Science and Technology, Islamabad, Pakistan.
University of Jeddah, College of Computing and Information Technology at Khulais, Department of Information Technology, Jeddah, Saudi Arabia.
PeerJ Comput Sci. 2022 Feb 22;8:e854. doi: 10.7717/peerj-cs.854. eCollection 2022.
Mobile communication has become a dominant medium of communication over the past two decades. New technologies and competitors are emerging rapidly, and churn prediction has become a major concern for telecom companies. A customer churn prediction model can accurately identify potential churners so that a retention solution can be offered to them. The proposed churn prediction model is a hybrid model that combines clustering and classification algorithms using an ensemble. First, different clustering algorithms (K-means, K-medoids, X-means and random clustering) were evaluated individually on two churn prediction datasets. Hybrid models were then built by combining the clusters with seven different classification algorithms individually, and the resulting models were evaluated using ensembles. The proposed research was evaluated on two benchmark telecom datasets obtained from the GitHub and Bigml platforms. The results indicate that the proposed model attained its highest prediction accuracy of 94.7% on the GitHub dataset and 92.43% on the Bigml dataset. The proposed model was also compared against state-of-the-art churn prediction models and performed significantly better than them.
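The cluster-then-classify idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the data are synthetic, K-means uses a deterministic farthest-first initialisation, three toy classifiers (nearest class mean, 3-nearest-neighbour, decision stump) stand in for the seven classifiers of the study, a per-cluster majority vote stands in for the ensemble, and all function names are hypothetical. Training one ensemble per cluster is only one possible reading of "combining the clusters with classification algorithms".

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))

def kmeans(points, k, iters=20):
    """Lloyd's K-means; farthest-first init keeps the sketch deterministic (assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers

def fit_nearest_mean(X, y):
    """Predict the class whose training mean is closest."""
    means = {}
    for cls in set(y):
        rows = [x for x, t in zip(X, y) if t == cls]
        means[cls] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return lambda p: min(means, key=lambda cls: dist2(p, means[cls]))

def fit_knn(X, y, k=3):
    """Predict the majority label among the k nearest training points."""
    def predict(p):
        idx = sorted(range(len(X)), key=lambda i: dist2(p, X[i]))[:k]
        return Counter(y[i] for i in idx).most_common(1)[0][0]
    return predict

def fit_stump(X, y):
    """One-feature threshold rule with the fewest training errors (labels are 0/1)."""
    majority = Counter(y).most_common(1)[0][0]
    best_errs, best_rule = sum(t != majority for t in y), None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for hi in (0, 1):  # hi is the label predicted above the threshold
                errs = sum((hi if x[f] > thr else 1 - hi) != t for x, t in zip(X, y))
                if errs < best_errs:
                    best_errs, best_rule = errs, (f, thr, hi)
    if best_rule is None:
        return lambda p: majority
    f, thr, hi = best_rule
    return lambda p: hi if p[f] > thr else 1 - hi

def fit_hybrid(X, y, k=2):
    """Cluster the customers, then train one majority-vote ensemble per cluster."""
    centers = kmeans(X, k)
    models = {}
    for c in range(k):
        rows = [i for i, x in enumerate(X) if nearest(x, centers) == c]
        rows = rows or list(range(len(X)))  # fall back to all data for an empty cluster
        Xc, yc = [X[i] for i in rows], [y[i] for i in rows]
        models[c] = [fit(Xc, yc) for fit in (fit_nearest_mean, fit_knn, fit_stump)]
    def predict(p):
        votes = [m(p) for m in models[nearest(p, centers)]]
        return Counter(votes).most_common(1)[0][0]
    return predict

def make_data(n, seed):
    """Synthetic customers in two segments with different churn rules (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        if rng.random() < 0.5:  # segment A: churn depends on the second feature
            p = (rng.gauss(0, 1), rng.gauss(0, 1))
            label = int(p[1] > 0)
        else:                   # segment B: churn depends on the first feature
            p = (rng.gauss(10, 1), rng.gauss(10, 1))
            label = int(p[0] > 10)
        X.append(p)
        y.append(label)
    return X, y

X_train, y_train = make_data(200, seed=1)
X_test, y_test = make_data(100, seed=2)
predict = fit_hybrid(X_train, y_train, k=2)
accuracy = sum(predict(p) == t for p, t in zip(X_test, y_test)) / len(X_test)
print(f"hold-out accuracy on synthetic data: {accuracy:.2f}")
```

The synthetic segments follow different churn rules, which is the situation where clustering first can help: each per-cluster ensemble only has to learn its own segment's rule. The printed accuracy describes this toy data only and says nothing about the figures reported in the abstract.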