Max Planck Institute for Evolutionary Anthropology, Department of Linguistic and Cultural Evolution, Leipzig, 04103, Germany.
Uppsala University, Department of Linguistics and Philology, Uppsala, 751 26, Sweden.
Sci Data. 2024 Aug 29;11(1):949. doi: 10.1038/s41597-024-03816-w.
Wurm & Hattori's Language Atlas of the Pacific Area describes the geographic speaker areas of the languages and language varieties spoken in the Pacific. Thanks to the efforts of the Electronic Cultural Atlas Initiative (ECAI), this monumental work has been available in digital form for over 15 years. However, because the language varieties were not properly identified, the digitized data was largely unusable with today's research methods. We turned ECAI's digitized artefacts of the Language Atlas into an open, reusable, geo-referenced dataset of speaker area polygons for a quarter of the world's languages. This enables much more refined analyses, for example of language contact in the region of the world with the highest linguistic diversity. We also describe a number of tool applications and quality checks that may be useful for methodological development in similar digitization efforts.