The Indo-European Cognate Relationships dataset.

Authors

Anderson Cormac, Scarborough Matthew, Jocz Lechosław, Kümmel Martin Joachim, Jügel Thomas, Irslinger Britta, Pooth Roland, Liljegren Henrik, Strand Richard F, Haig Geoffrey, Geupel Ulrich, Macak Martin, Kim Ronald I, Anonby Erik, Pronk Tijmen, Belyaev Oleg, Dewey-Findell Tonya Kim, Boutilier Matthew, Freiberg Cassandra, Tegethoff Robert, Serangeli Matilde, Stroński Krzysztof, Falileyev Alexander, Liosis Nikos, Schulte Kim, Gupta Ganesh Kumar, Izadifar Raheleh, Markus Patrycja, Williams Nicholas, Loi Simone, Sims-Williams Nicholas, Findell Martin, Adibifar Shirin, Abete Giovanni, Atanasov Petar, Baiwir Esther, Bastardas Maria-Reina, Benkato Adam, Bevevino Lisa Shugert, Buchi Éva, Cadorini Giorgio, Cathcart Chundra, Cheveau Loïc, Christodoulou Charalambos, Delorme Jérémie, Dworkin Steven N, Ekici Deniz, Farridnejad Shervin, Gheitasi Mojtaba, Hammarström Harald, Hewitt Steve, Khan Afsar Ali, Khan Muhammad Kamal, Khokhlova Liudmila, Kim Deborah, Lewin Christopher, Lushaj Borana, Mahmoudveysi Parvin, Mahommadirad Masoud, Mersch Sam, Mustafa Baydaa, Nemati Fatemeh, Nourzaei Maryam, Muircheartaigh Peadar Ó, Oogjen Virginia, Ourang Muhammed, Pagan Heather, Palmer Timothy S, Pepper Steve, Purandare Mandar, Rehman Khwaja, Rhys Guto, Røyneland Unn, Sagar Muhammad Zaman, Sandstedt Jade Jørgen, Steensland Lars, Taheri-Ardali Mortaza, Talebi-Dastenaei Mahnaz, Tittel Sabine, Tresoldi Tiago, de Vaan Michiel, Verkerk Annemarie, Versloot Arjen, Videsott Paul, Vuletić Nikola, Widmer Manuel, Zeini Arash, Bibiko Hans-Jörg, Runge Fiona, Gray Russell D, Heggarty Paul

Affiliations

Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103, Leipzig, Germany.

Surrey Morphology Group, University of Surrey, Guildford, Surrey, GU2 7XH, UK.

Publication information

Sci Data. 2025 Sep 2;12(1):1541. doi: 10.1038/s41597-025-05445-3.

DOI:10.1038/s41597-025-05445-3
PMID:40897732
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12405575/
Abstract

The Indo-European Cognate Relationships (IE-CoR) dataset is an open-access relational dataset showing how related, inherited words ('cognates') pattern across 160 languages of the Indo-European family. IE-CoR is intended as a benchmark dataset for computational research into the evolution of the Indo-European languages. It is structured around 170 reference meanings in core lexicon, and contains 25731 lexeme entries, analysed into 4981 cognate sets. Novel, dedicated structures are used to code all known cases of horizontal transfer. All 13 main documented clades of Indo-European, and their main subclades, are well represented. Time calibration data for each language are also included, as are relevant geographical and social metadata. Data collection was performed by an expert consortium of 89 linguists drawing on 355 cited sources. The dataset is extendable to further languages and meanings and follows the Cross-Linguistic Data Format (CLDF) protocols for linguistic data. It is designed to be interoperable with other cross-linguistic datasets and catalogues, and provides a reference framework for similar initiatives for other language families.
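The relational structure described in the abstract (lexeme entries linked to cognate sets, organised by reference meaning) can be illustrated with a minimal sketch. It assumes the default column names of a CLDF Wordlist (`ID`, `Language_ID`, `Parameter_ID`, `Form` for forms; `Form_ID`, `Cognateset_ID` for cognate judgements). The actual IE-CoR files, identifiers, and extra columns may differ, and the rows below are toy data written for this example, not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Toy stand-ins for two CLDF Wordlist tables.
# ASSUMPTION: rows and IDs are invented for illustration, not copied from IE-CoR.
FORMS_CSV = """ID,Language_ID,Parameter_ID,Form
f1,english,water,water
f2,german,water,Wasser
f3,french,water,eau
f4,italian,water,acqua
"""

COGNATES_CSV = """ID,Form_ID,Cognateset_ID
c1,f1,set_A
c2,f2,set_A
c3,f3,set_B
c4,f4,set_B
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def cognate_sets_by_meaning(forms_text, cognates_text):
    """Return {meaning: {cognate set: sorted list of languages}}."""
    forms = {row["ID"]: row for row in read_rows(forms_text)}
    grouped = defaultdict(lambda: defaultdict(list))
    for link in read_rows(cognates_text):
        form = forms[link["Form_ID"]]  # follow the foreign key to the lexeme
        grouped[form["Parameter_ID"]][link["Cognateset_ID"]].append(
            form["Language_ID"]
        )
    return {
        meaning: {cset: sorted(langs) for cset, langs in sets.items()}
        for meaning, sets in grouped.items()
    }


def presence_matrix(sets_by_meaning, languages):
    """Binary coding: 1 if a language has a lexeme in the cognate set, else 0."""
    matrix = {}
    for sets in sets_by_meaning.values():
        for cset, langs in sets.items():
            matrix[cset] = [1 if lang in langs else 0 for lang in languages]
    return matrix


if __name__ == "__main__":
    sets = cognate_sets_by_meaning(FORMS_CSV, COGNATES_CSV)
    print(sets)
    print(presence_matrix(sets, ["english", "french", "german", "italian"]))
```

A presence/absence coding of this kind is one common input format for phylogenetic analyses of lexical data. The sketch ignores the dedicated structures for horizontal transfer that the abstract mentions.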


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff39/12405575/46f7909b1d67/41597_2025_5445_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff39/12405575/a3caf95f8a59/41597_2025_5445_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff39/12405575/a3e960ddf27a/41597_2025_5445_Fig3_HTML.jpg
