
Lahjoita puhetta: a large-scale corpus of spoken Finnish with some benchmarks.

Authors

Moisio Anssi, Porjazovski Dejan, Rouhe Aku, Getman Yaroslav, Virkkunen Anja, AlGhezi Ragheb, Lennes Mietta, Grósz Tamás, Lindén Krister, Kurimo Mikko

Affiliations

Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland.

Department of Digital Humanities, University of Helsinki, Helsinki, Finland.

Publication details

Lang Resour Eval. 2022 Aug 9:1-33. doi: 10.1007/s10579-022-09606-3.

DOI: 10.1007/s10579-022-09606-3
PMID: 35965738
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9362968/

Abstract

The Donate Speech campaign has so far succeeded in gathering approximately 3600 h of ordinary, colloquial Finnish speech into the Lahjoita puhetta (Donate Speech) corpus. The corpus includes over twenty thousand speakers from all the regions of Finland and from all age brackets. The primary goals of the collection were to create a representative, large-scale resource to study spontaneous spoken Finnish and to accelerate the development of language technology and speech-based services. In this paper, we present the collection process and the collected corpus, and showcase its versatility through multiple use cases. The evaluated use cases include automatic speech recognition of spontaneous speech; detection of age, gender, dialect and topic; and metadata analysis. We provide benchmarks for the use cases, as well as downloadable, trained baseline systems with open-source code for reproducibility. One further use case is to verify the metadata and transcripts given in this corpus itself, and to suggest artificial metadata and transcripts for the parts of the corpus where they are missing.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/18cdb6b2eaba/10579_2022_9606_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/cb6383299217/10579_2022_9606_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/74e2fcc3a01c/10579_2022_9606_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/e7640ade13c4/10579_2022_9606_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/520554965e0d/10579_2022_9606_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/974c525bb87a/10579_2022_9606_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/c0b7f30980dc/10579_2022_9606_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/31235dfbbfab/10579_2022_9606_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51e1/9362968/eca78dd89de9/10579_2022_9606_Fig9_HTML.jpg

