Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland.
Centre for Language and Speech Technology, Radboud University, Nijmegen, The Netherlands.
Clin Linguist Phon. 2022 Mar 4;36(2-3):102-110. doi: 10.1080/02699206.2021.1913514. Epub 2021 Apr 23.
Corpora of speech of individuals with communication disorders (CSD) are invaluable resources for education and research, but they are costly and difficult to build and, for various reasons, difficult to share. DELAD, which means 'shared' in Swedish, is a project initiated by Professors Nicole Müller and Martin Ball in 2015 that aims to address this issue by establishing a platform for researchers to share datasets of speech disorders with interested audiences. To date, four workshops have been held, at which selected participants with expertise spanning clinical phonetics and linguistics, speech and language therapy, infrastructure, and ethics and law discussed the issues involved in setting up such an archive. Steady progress has been made since 2015, including refurbishing the DELAD website (http://delad.net/) with information and application forms for researchers to join and share their datasets, and linking with the CLARIN K-Centre for Atypical Communication Expertise (https://ace.ruhosting.nl/), where CSD can be hosted and accessed through the CLARIN B-Centres, The Language Archive (https://tla.mpi.nl/tools/tla-tools/) and TalkBank (https://talkbank.org/). The latest workshop, which was funded by CLARIN (Common Language Resources and Technology Infrastructure), was held as an online event in January 2021 on topics including Data Protection Impact Assessments, a review of changes in ethics perspectives in academia on sharing CSD, and voice conversion as a means to pseudonymise speech. This paper reports the latest progress of DELAD and discusses directions for further advancing the initiative, with information on how researchers can contribute to the repository.