Lei Mingyu, Pelz Setu, Pachauri Shonali, Cai Wenjia
Department of Earth System Science, Institute for Global Change Studies, Ministry of Education Key Laboratory for Earth System Modeling, Tsinghua University, Beijing, 100084, China.
Tsinghua-Rio Tinto Joint Research Centre for Resources, Energy and Sustainable Development, International Joint Laboratory on Low Carbon Clean Energy Innovation, Laboratory for Low Carbon Energy, Tsinghua University, Beijing, 100084, China.
Sci Data. 2024 Dec 27;11(1):1436. doi: 10.1038/s41597-024-04304-x.
Projections of future income distributions at subnational levels are increasingly important for a wide range of analyses and evaluations, yet relevant datasets remain limited. This study presents a methodological framework that introduces machine learning algorithms into a top-down approach for generating income distribution datasets. We project per capita disposable income and income inequality for 31 Chinese provinces from 2020 to 2100 under different scenarios based on China's local circumstances, and then estimate income distributions from these projections. After accounting for the necessary consistency between provincial, urban, and rural income datasets, we further generate the same data products at the urban and rural levels for each province. We validate our projections against data on China's disposable income for 2007 to 2023, data on provincial income inequality in China for 2007 to 2019, and national income inequality data covering the past 20 to 60 years from selected developed countries. The proposed methodology offers the flexibility to generate similar data products tailored to a user's specific needs. The resulting datasets have several potential applications and can serve as inputs for research on drivers and impacts across social, economic, and environmental domains.
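The step of estimating an income distribution from a projected mean income and an inequality measure can be illustrated with a minimal sketch. It assumes that inequality is summarized by a Gini coefficient and that incomes follow a lognormal distribution; the abstract states neither, so this is one common parametric choice and not necessarily the authors' method. The function names and the input values (a mean income of 30000 and a Gini of 0.4) are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def lognormal_params(mean_income, gini):
    """Return (mu, sigma) of a lognormal with the given mean and Gini.

    Assumption: incomes are lognormal, for which
    Gini = 2 * Phi(sigma / sqrt(2)) - 1 and mean = exp(mu + sigma^2 / 2).
    """
    if mean_income <= 0.0:
        raise ValueError("mean_income must be positive")
    if not 0.0 < gini < 1.0:
        raise ValueError("gini must lie strictly between 0 and 1")
    sigma = sqrt(2.0) * _STD_NORMAL.inv_cdf((gini + 1.0) / 2.0)
    mu = log(mean_income) - sigma ** 2 / 2.0
    return mu, sigma


def decile_shares(gini):
    """Income share of each population decile, poorest first.

    Uses the lognormal Lorenz curve L(p) = Phi(Phi^{-1}(p) - sigma).
    """
    _, sigma = lognormal_params(1.0, gini)
    lorenz = [0.0]
    for k in range(1, 10):
        p = k / 10.0
        lorenz.append(_STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(p) - sigma))
    lorenz.append(1.0)
    return [lorenz[i + 1] - lorenz[i] for i in range(10)]


def decile_mean_incomes(mean_income, gini):
    """Mean income within each decile; each decile holds 10% of people."""
    return [10.0 * share * mean_income for share in decile_shares(gini)]


if __name__ == "__main__":
    # Hypothetical inputs, not values taken from the dataset.
    for i, income in enumerate(decile_mean_incomes(30000.0, 0.4), start=1):
        print(f"decile {i}: {income:.0f}")
```

Under these assumptions the decile means average back to the projected mean income, so the distribution stays consistent with the top-down totals it was derived from.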