K-SPAN: A lexical database of Korean surface phonetic forms and phonological neighborhood density statistics.

Affiliations

Department of Korean Language and Literature, Korea University, 145 Anam-ro Seongbuk-gu, Seoul, 02841, South Korea.

Laboratoire de Sciences Cognitives et Psycholinguistique (ENS, EHESS, CNRS), Département d'Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29, rue d'Ulm, 75005, Paris, France.

Publication information

Behav Res Methods. 2017 Oct;49(5):1939-1950. doi: 10.3758/s13428-016-0836-8.

DOI: 10.3758/s13428-016-0836-8
PMID: 28155186

Abstract

This article presents K-SPAN (Korean Surface Phonetics and Neighborhoods), a database of surface phonetic forms and several measures of phonological neighborhood density for 63,836 Korean words. Currently publicly available Korean corpora are limited by the fact that they only provide orthographic representations in Hangeul, which is problematic since phonetic forms in Korean cannot be reliably predicted from orthographic forms. We describe the method used to derive the surface phonetic forms from a publicly available orthographic corpus of Korean, and report on several statistics calculated using this database; namely, segment unigram frequencies, which are compared to previously reported results, along with segment-based and syllable-based neighborhood density statistics for three types of representation: an "orthographic" form, which is a quasi-phonological representation, a "conservative" form, which maintains all known contrasts, and a "modern" form, which represents the pronunciation of contemporary Seoul Korean. These representations are rendered in an ASCII-encoded scheme, which allows users to query the corpus without having to read Korean orthography, and permits the calculation of a wide range of phonological measures.
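
To make the two kinds of statistic named in the abstract concrete, here is a minimal Python sketch; it is not the authors' method. It assumes the common one-edit definition of a phonological neighbor (a form reachable by substituting, deleting, or adding exactly one segment), which the abstract does not confirm is the definition used in K-SPAN. The toy lexicon is made up for illustration and is neither K-SPAN data nor its ASCII encoding, and all function names are hypothetical.

```python
from collections import Counter


def one_edit_apart(a, b):
    """True if segment tuples a and b differ by exactly one
    substitution, deletion, or addition of a single segment."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: make `a` the shorter form, then check
    # whether deleting one segment from `b` yields `a`.
    if len(a) > len(b):
        a, b = b, a
    return any(b[:i] + b[i + 1:] == a for i in range(len(b)))


def neighborhood_density(word, lexicon):
    """Number of lexicon entries that are one-edit neighbors of word."""
    return sum(1 for other in lexicon
               if other != word and one_edit_apart(word, other))


def segment_unigram_counts(lexicon):
    """Count how often each segment occurs across all lexicon entries."""
    return Counter(seg for word in lexicon for seg in word)


# Hypothetical toy lexicon: each word is a tuple of segment symbols.
LEXICON = {
    ("k", "a", "m"),
    ("k", "a", "n"),
    ("p", "a", "m"),
    ("k", "a"),
    ("a", "m"),
    ("s", "o", "n"),
}

if __name__ == "__main__":
    for word in sorted(LEXICON):
        print("".join(word), neighborhood_density(word, LEXICON))
    print(segment_unigram_counts(LEXICON).most_common(3))
```

Representing each word as a tuple of segments rather than a plain string keeps the sketch usable when a segment is written with more than one character. A syllable-based density could be computed the same way by making each tuple element a syllable instead of a segment.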

Similar articles

1
K-SPAN: A lexical database of Korean surface phonetic forms and phonological neighborhood density statistics.
Behav Res Methods. 2017 Oct;49(5):1939-1950. doi: 10.3758/s13428-016-0836-8.
2
Database of word-level statistics for Mandarin Chinese (DoWLS-MAN).
Behav Res Methods. 2022 Apr;54(2):987-1009. doi: 10.3758/s13428-021-01620-7. Epub 2021 Aug 17.
3
The Malay Lexicon Project: a database of lexical statistics for 9,592 words.
Behav Res Methods. 2010 Nov;42(4):992-1003. doi: 10.3758/BRM.42.4.992.
4
Phonetic radicals, not phonological coding systems, support orthographic learning via self-teaching in Chinese.
Cognition. 2018 Jul;176:184-194. doi: 10.1016/j.cognition.2018.02.025. Epub 2018 Mar 21.
5
The Role of Orthography in Lexical Processing of the Phonological Variants in Second Language.
J Psycholinguist Res. 2021 Apr;50(2):437-445. doi: 10.1007/s10936-020-09725-4.
6
Phonological assembly in reading: lexical contribution leads to violation of graphophonological rules.
Mem Cognit. 1991 Nov;19(6):568-78. doi: 10.3758/bf03197152.
7
From orthography to phonetics: ERP measures of grapheme-to-phoneme conversion mechanisms in reading.
J Cogn Neurosci. 2004 Mar;16(2):301-17. doi: 10.1162/089892904322984580.
8
CLEARPOND: cross-linguistic easy-access resource for phonological and orthographic neighborhood densities.
PLoS One. 2012;7(8):e43230. doi: 10.1371/journal.pone.0043230. Epub 2012 Aug 20.
9
Procura-PALavras (P-PAL): A Web-based interface for a new European Portuguese lexical database.
Behav Res Methods. 2018 Aug;50(4):1461-1481. doi: 10.3758/s13428-018-1058-z.
10
PhonItalia: a phonological lexicon for Italian.
Behav Res Methods. 2014 Sep;46(3):872-86. doi: 10.3758/s13428-013-0400-8.

Cited by

1
Syllables and their beginnings have a special role in the mental lexicon.
Proc Natl Acad Sci U S A. 2023 Sep 5;120(36):e2215710120. doi: 10.1073/pnas.2215710120. Epub 2023 Aug 28.
2
Database of word-level statistics for Mandarin Chinese (DoWLS-MAN).
Behav Res Methods. 2022 Apr;54(2):987-1009. doi: 10.3758/s13428-021-01620-7. Epub 2021 Aug 17.