Crowdsourcing to Assess Speech Quality Associated With Velopharyngeal Dysfunction.

Affiliations

Speech & Language Services, Seattle Children's Hospital, Seattle, WA, USA.

Division of Plastic Surgery, Department of Surgery, University of Washington, Seattle, WA, USA.

Publication Information

Cleft Palate Craniofac J. 2021 Jan;58(1):25-34. doi: 10.1177/1055665620948770. Epub 2020 Aug 18.

Abstract

OBJECTIVE

To assess crowdsourced responses in the evaluation of speech outcomes in children with velopharyngeal dysfunction (VPD).

DESIGN

Fifty deidentified speech samples were compiled. Multiple pairwise comparisons obtained by crowdsourcing were used to produce a rank order of speech quality. Ratings of overall and specific speech characteristics were also collected. Twelve speech-language pathologists (SLPs) who specialize in VPD were asked to complete the same tasks. Crowds and experts completed each task on 2 separate occasions at least 1 week apart.
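
The abstract does not detail how the pairwise comparisons were converted into a rank order, but the Results refer to Elo rankings, so the sketch below illustrates an Elo-style rating update on hypothetical data. The sample IDs, K-factor, and starting rating are assumptions for illustration only, not values reported by the authors.

    from collections import defaultdict

    K = 32            # assumed update step (K-factor); not reported in the abstract
    START = 1500.0    # assumed starting rating for every speech sample

    def expected(r_a, r_b):
        # Expected score of sample A against sample B under the Elo model
        return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

    def update(ratings, preferred, other):
        # Apply one pairwise judgment: 'preferred' was judged the better-sounding sample
        e = expected(ratings[preferred], ratings[other])
        ratings[preferred] += K * (1.0 - e)
        ratings[other] -= K * (1.0 - e)

    ratings = defaultdict(lambda: START)

    # Hypothetical crowdsourced judgments: (preferred_sample, other_sample)
    judgments = [("S03", "S17"), ("S17", "S42"), ("S03", "S42"), ("S42", "S17")]
    for preferred, other in judgments:
        update(ratings, preferred, other)

    # Rank order of speech quality: highest Elo rating first
    rank_order = sorted(ratings, key=ratings.get, reverse=True)
    print(rank_order)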

SETTING

Online crowdsourcing platform.

PARTICIPANTS

Crowdsourced raters were anonymous North American English speakers, at least 18 years of age, with self-reported normal hearing. Speech-language pathologists were recruited from multiple cleft/craniofacial teams.

INTERVENTIONS

None.

MAIN OUTCOME MEASURE(S)

Correlation of repeated assessments and comparison of crowd and SLP assessments.

RESULTS

We obtained 6331 layperson assessments that met inclusion criteria via crowdsourcing within 8 hours. The crowds provided reproducible Elo rankings of speech quality, ρ(48) = .89, P < .0001, and consistent ratings of intelligibility and acceptability (intraclass correlation coefficient [ICC] = .87 and .92) on repeated assessments. There was a significant correlation of the crowd rankings, ρ(10) = .86, P = .0003, and ratings (ICC = .75 and .79) with those of the SLPs. The correlation between crowd and SLP assessments of more specific speech characteristics was moderate to weak (ICC < .65).
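
As an illustration of the two reliability statistics reported above, the sketch below computes a Spearman rank correlation between two assessment occasions and a one-way random-effects ICC(1,1) on hypothetical data. The abstract does not state which ICC form the authors used, so the ICC(1,1) choice and all data here are assumptions.

    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)

    # Hypothetical Elo rank orders of the same 50 samples on two occasions
    rank_occasion1 = np.arange(1, 51)
    rank_occasion2 = rng.permutation(rank_occasion1)

    rho, p = spearmanr(rank_occasion1, rank_occasion2)
    print(f"Spearman rho = {rho:.2f}, P = {p:.4f}")

    def icc_1_1(x):
        # One-way random-effects ICC(1,1); x is an (n targets x k ratings) array
        n, k = x.shape
        grand = x.mean()
        target_means = x.mean(axis=1)
        msb = k * ((target_means - grand) ** 2).sum() / (n - 1)          # between-target mean square
        msw = ((x - target_means[:, None]) ** 2).sum() / (n * (k - 1))   # within-target mean square
        return (msb - msw) / (msb + (k - 1) * msw)

    # Hypothetical intelligibility ratings: 50 samples rated on 2 occasions
    ratings = rng.normal(loc=3.0, scale=1.0, size=(50, 2))
    print(f"ICC(1,1) = {icc_1_1(ratings):.2f}")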

CONCLUSIONS

Crowdsourcing shows promise as a rapid way to obtain large numbers of speech assessments. Reliability of repeated assessments was acceptable. Large groups of naive raters yielded evaluations of overall speech acceptability, intelligibility, and quality comparable to those of experts, but were not consistent with expert raters for specific speech characteristics such as resonance and nasal air emission.
