Huang Wei, Liu Yinke, Hu Peiqi, Ding Shiyu, Gao Shuhui, Zhang Ming
School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China.
Heliyon. 2023 Aug 30;9(9):e19525. doi: 10.1016/j.heliyon.2023.e19525. eCollection 2023 Sep.
Poverty eradication has always been a major challenge for global development and governance, and it has received widespread attention from every country. With the completion of the poverty alleviation task in 2020, relative poverty governance has become an urgent issue for China. Because of a large population, poor infrastructure, insufficient resources, and long-term uneven development, raising the living standard of farmers in rural areas is critical to China's success in realizing moderate prosperity. Therefore, identifying poor farmers, exploring the factors that influence relative poverty, and clarifying its effect mechanism in rural areas are significant for subsequent poverty governance. Most previous studies took an a priori approach, assuming a factor system and then verifying the hypotheses. We constructed a new relative poverty index system consistent with China's actual conditions, selected all the variables that could affect relative poverty based on the existing literature, including individual characteristics, psychological endowment, and geographical environment, and rebuilt an experimental database. Then, through data processing and data analysis, the main factors influencing the relative poverty of farmers were systematically sorted out using machine learning methods. Finally, the 25 selected influencing factors were discussed in detail. The research findings show that: 1) Machine learning algorithms can be applied well to the relative poverty field, especially XGBoost, which achieves 81.9% accuracy and an ROC_AUC score of 0.819. 2) This study points to many new research directions in applying machine learning to relative poverty research, and the paper offers an integrated framework and a useful reference for target identification using machine learning algorithms.
3) In addition, interpretability tools, namely PDP and SHAP explanations, make the "black box" of ML transparent, and they also reveal that machine learning models can readily handle non-linear relationships.
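For readers unfamiliar with the two evaluation measures named in the findings, the following is a minimal sketch of how accuracy and ROC_AUC are commonly computed for a binary poor/non-poor classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are not taken from the study's data.

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """Rank-based ROC AUC (Mann-Whitney U), using average ranks for ties."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs both positive and negative samples")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy data: 1 = relatively poor, 0 = not relatively poor.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]  # assumed model-predicted probabilities
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

toy_accuracy = accuracy(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Accuracy depends on the chosen decision threshold, whereas ROC_AUC depends only on how the model ranks positive samples against negative ones, which is why the two are usually reported together.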