Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Faculty of Health Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands.
The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands.
J Am Med Inform Assoc. 2023 Feb 16;30(3):411-417. doi: 10.1093/jamia/ocac241.
In long-term care (LTC) for older adults, interviews are used to collect client perspectives and are often recorded and transcribed verbatim, a time-consuming, tedious task. Automatic speech recognition (ASR) could provide a solution; however, current ASR systems are not effective for certain demographic groups. This study aims to show how data from specific groups, such as older adults or people with accents, can be used to develop an effective ASR model.
An initial ASR model was developed using the Mozilla Common Voice dataset. Audio and verbatim transcript data (34 h) from interviews with residents, family, and care professionals on quality of care were then used to adapt the model, with interview data processed iteratively to reduce the word error rate (WER).
Due to background noise and mispronunciations, the initial ASR model had a WER of 48.3% on interview data. After fine-tuning on interview data, the average WER was reduced to 24.3%. When tested on held-out speech data from the interviews, a median WER of 22.1% was achieved, with residents displaying the highest WER (22.7%). The resulting ASR model was at least 6 times faster than manual transcription.
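For reference, the WER figures reported above are conventionally computed as (substitutions + deletions + insertions) divided by the number of reference words, i.e., a word-level edit distance. A minimal sketch (not the authors' implementation) in Python:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # DP table: d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

# Example: one substitution and one deletion over a 4-word reference -> 0.5
print(wer("the resident was satisfied", "the resident is"))
```

In practice, transcripts are normalized (lowercasing, punctuation removal) before scoring, since formatting differences would otherwise inflate the WER.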
The current method decreased the WER substantially, verifying its efficacy. Moreover, transcribing audio locally can benefit the privacy of participants.
The current study shows that interview data from LTC for older adults can be used effectively to improve an ASR model. While the model output still contains some errors, researchers reported that it saved considerable time during transcription.