

Testing methods of linguistic homeland detection using synthetic data.

Affiliations

Leiden University Centre for Linguistics, Leiden University, Postbus 9515, Leiden 2300 RA, The Netherlands.

Laboratory for Quantitative Linguistics, Kazan Federal University, Kremlevskaya Street 18, Kazan 420000, Russia.

Publication information

Philos Trans R Soc Lond B Biol Sci. 2021 May 10;376(1824):20200202. doi: 10.1098/rstb.2020.0202. Epub 2021 Mar 22.

Abstract

Two families of quantitative methods have been used to infer geographical homelands of language families: Bayesian phylogeography and the 'diversity method'. Bayesian methods model how populations may have moved using a phylogenetic tree as a backbone, while the diversity method assumes that the geographical area where linguistic diversity is highest likely corresponds to the homeland. No systematic tests of the performance of the different methods in a linguistic context have so far been published. Here, we carry out performance testing by simulating language families, including branching structures and word lists, along with speaker populations moving in space. We test six different methods: two versions of BayesTraits; the relaxed random walk model of BEAST 2; our own RevBayes implementations of a fixed-rate and a variable-rates random walk model; and the diversity method. As a result of the tests, we propose a hierarchy of performance of the different methods. Factors such as geographical idiosyncrasies, incomplete sampling, tree imbalance and small family sizes all have a negative impact on performance, but they mostly affect the methods across the board, so the performance hierarchy is generally impervious to such factors. This article is part of the theme issue 'Reconstructing prehistoric languages'.
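The abstract characterizes the diversity method only by its core assumption, namely that the homeland lies where linguistic diversity is highest. The following is a minimal sketch of one way that idea could be operationalized, not the authors' implementation. It assumes that each language is scored by the mean ratio of linguistic distance to geographical distance against every other language in the family, and that the location of the top-scoring language is taken as the homeland estimate. The function names, the toy coordinates and the toy distances are all hypothetical.

```python
import math


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def diversity_homeland(coords, ling_dist):
    """Score each language by its mean linguistic/geographic distance ratio.

    coords:    {language: (lat, lon)}
    ling_dist: {frozenset({lang_i, lang_j}): distance in [0, 1]}
    Returns (top-scoring language, {language: score}).
    ASSUMPTION: this ratio-based index is an illustrative choice; the
    abstract does not specify how diversity is measured.
    """
    scores = {}
    for i in coords:
        ratios = []
        for j in coords:
            if i == j:
                continue
            geo = great_circle_km(coords[i], coords[j])
            if geo == 0:
                continue  # skip co-located languages to avoid division by zero
            ratios.append(ling_dist[frozenset((i, j))] / geo)
        scores[i] = sum(ratios) / len(ratios) if ratios else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy family: A, B and C are neighbours, D is a distant outlier.
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (1.0, 0.0), "D": (0.0, 20.0)}
ling_dist = {
    frozenset(("A", "B")): 0.9, frozenset(("A", "C")): 0.9,
    frozenset(("B", "C")): 0.3, frozenset(("A", "D")): 0.9,
    frozenset(("B", "D")): 0.9, frozenset(("C", "D")): 0.9,
}

best, scores = diversity_homeland(coords, ling_dist)
print(best, coords[best])
```

In this toy family, language A is linguistically distant from both of its close neighbours, so it receives the highest score. The outlier D is equally distant linguistically but far away geographically, so it scores lowest. A performance test in the spirit of the abstract would compare such an estimate against the known origin of a simulated family.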




