Augmenting the availability of historical GDP per capita estimates through machine learning.

Authors

Koch Philipp, Stojkoski Viktor, Hidalgo César A

Affiliations

Center for Collective Learning, Artificial and Natural Intelligence Toulouse Institute, Institut de Recherche en Informatique de Toulouse, Université de Toulouse, 31000 Toulouse, France.

EcoAustria-Institute for Economic Research, 1030 Vienna, Austria.

Publication information

Proc Natl Acad Sci U S A. 2024 Sep 24;121(39):e2402060121. doi: 10.1073/pnas.2402060121. Epub 2024 Sep 16.

Abstract

Can we use data on the biographies of historical figures to estimate the GDP per capita of countries and regions? Here, we introduce a machine learning method to estimate the GDP per capita of dozens of countries and hundreds of regions in Europe and North America for the past seven centuries, starting from data on the places of birth, death, and occupations of hundreds of thousands of historical figures. We build an elastic net regression model to perform feature selection and generate out-of-sample estimates that explain 90% of the variance in known historical income levels. We use this model to generate GDP per capita estimates for countries, regions, and time periods for which these data are not available, and we externally validate our estimates by comparing them with four proxies of economic output: urbanization rates in the past 500 years, body height in the 18th century, well-being in 1850, and church building activity in the 14th and 15th centuries. Additionally, we show that our estimates reproduce the well-known reversal of fortune between southwestern and northwestern Europe between 1300 and 1800 and find that it is largely driven by countries and regions engaged in Atlantic trade. These findings validate the use of fine-grained biographical data as a method to augment historical GDP per capita estimates. We publish our estimates with confidence intervals, together with all collected source data, in a comprehensive dataset.
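
As a rough illustration of the modeling step described in the abstract, the sketch below fits an elastic net by coordinate descent and scores it on held-out data. It is not the authors' pipeline: the data are synthetic (five made-up features standing in for biographical counts, of which only two carry signal), the penalty values `alpha` and `l1_ratio` are arbitrary assumptions, and all function names are hypothetical. In practice one would more likely use a library implementation such as scikit-learn's `ElasticNet`.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.05, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for the objective
    (1/2n)*||y - b - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2, with an unpenalized intercept b."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center the data so the intercept can be recovered after fitting.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]  # residual when all weights are zero
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in Xc]
            z = sum(c * c for c in col) / n
            # Correlation of feature j with the residual that excludes it.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
            w[j] = new_w
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, intercept


def predict(X, w, intercept):
    return [intercept + sum(wj * xj for wj, xj in zip(w, row)) for row in X]


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def make_synthetic(n, rng):
    """Synthetic stand-in data: only the first two features matter."""
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(5)]
        X.append(row)
        y.append(2.0 * row[0] - 1.0 * row[1] + rng.gauss(0, 0.1))
    return X, y


def demo(seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_synthetic(80, rng)
    X_test, y_test = make_synthetic(40, rng)  # held out from fitting
    w, b = fit_elastic_net(X_train, y_train)
    return w, r_squared(y_test, predict(X_test, w, b))


if __name__ == "__main__":
    coef, r2 = demo()
    print("coefficients:", [round(c, 3) for c in coef])
    print("out-of-sample R^2:", round(r2, 3))
```

The L1 part of the penalty pushes the weights of uninformative features toward zero, which is the sense in which an elastic net performs feature selection. The held-out R² is the same kind of quantity as the "variance explained" figure quoted in the abstract, here computed on toy data only.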

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b6ea/11441543/5215185509d7/pnas.2402060121fig01.jpg
