Spoken Language Analysis in Aging Research: The Validity of AI-Generated Speech to Text Using OpenAI's Whisper.

Authors

Naffah Ava, Pfeifer Valeria A, Mehl Matthias R

Affiliation

Department of Psychology, University of Arizona, Tucson, Arizona, USA.

Publication information

Gerontology. 2025;71(5):417-424. doi: 10.1159/000545244. Epub 2025 Mar 13.

DOI: 10.1159/000545244
PMID: 40552848
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12188092/

Abstract

INTRODUCTION

Studying what older adults say can provide important insights into cognitive, affective, and social aspects of aging. Available language analysis tools generally require audio-recorded speech to be transcribed into verbatim text, a task that has historically been performed by humans. However, recent advances in AI-based language processing open up the possibility of replacing this time- and resource-intensive task with fully automatic speech to text.

METHODS

This study evaluates the accuracy of two common automatic speech-to-text tools - OpenAI's Whisper and otter.ai - relative to human-corrected transcripts. Based on two speech tasks completed by 238 older adults, we used the Linguistic Inquiry and Word Count (LIWC) to compare language features of text generated by each transcription method. The study further assessed the degree to which manual tagging of filler words (e.g., "like," "well") common in spoken language impacts the validity of the analysis.

RESULTS

The AI-based LIWC features evidenced very high convergence with the LIWC features derived from the human-corrected transcripts (average r = 0.98). Further, manual tagging of filler words did not affect validity for any LIWC feature except the filler-word and netspeak categories.
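
As an illustration of what "convergence" between transcription methods means here, the following minimal Python sketch computes per-category word percentages for two versions of each transcript and then correlates them across participants. It is not the authors' pipeline: the tiny `CATEGORIES` dictionary, the function names, and the use of a plain arithmetic mean of the per-feature Pearson correlations are all assumptions made for illustration (the real LIWC dictionary is proprietary and much larger, and the abstract does not say how the average r was computed).

```python
import re
from math import sqrt

# Hypothetical mini-dictionary for illustration only; NOT the real LIWC dictionary.
CATEGORIES = {
    "filler": {"like", "well", "um"},
    "posemo": {"happy", "good", "love"},
}


def category_rates(text):
    """Return, per category, the percentage of words in `text` that belong to it."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty transcripts
    return {
        name: 100.0 * sum(word in vocab for word in words) / total
        for name, vocab in CATEGORIES.items()
    }


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a feature has no variance")
    return cov / sqrt(var_x * var_y)


def convergence(human_texts, ai_texts):
    """Correlate each feature across participants; return (per-feature r, mean r)."""
    human = [category_rates(t) for t in human_texts]
    ai = [category_rates(t) for t in ai_texts]
    per_feature = {
        name: pearson_r([h[name] for h in human], [a[name] for a in ai])
        for name in CATEGORIES
    }
    mean_r = sum(per_feature.values()) / len(per_feature)
    return per_feature, mean_r


if __name__ == "__main__":
    # Made-up transcripts: one human-corrected and one automatic version per speaker.
    human = ["well I am happy today", "I like the good food like always", "we walk home"]
    ai = ["well I am happy today", "I like the good food always", "we walk home"]
    print(convergence(human, ai))
```

In this sketch each list position stands for one participant, so a high r for a category means that speakers who score high on it in the human-corrected transcripts also score high in the automatic ones.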

CONCLUSION

These findings support that Whisper and otter.ai are valuable tools for language analysis in aging research and provide further evidence that automatic speech to text with state-of-the-art AI tools is ready for psychological language research.

Similar articles

1
Spoken Language Analysis in Aging Research: The Validity of AI-Generated Speech to Text Using OpenAI's Whisper.
Gerontology. 2025;71(5):417-424. doi: 10.1159/000545244. Epub 2025 Mar 13.
2
Interventions for childhood apraxia of speech.
Cochrane Database Syst Rev. 2018 May 30;5(5):CD006278. doi: 10.1002/14651858.CD006278.pub3.
3
Speech changes in old age: Methodological considerations for speech-based discrimination of healthy ageing and Alzheimer's disease.
Int J Lang Commun Disord. 2024 Jan-Feb;59(1):13-37. doi: 10.1111/1460-6984.12888. Epub 2023 May 4.
4
Neonatal Nurses' Understanding of the Factors That Enhance and Hinder Early Communication Between Preterm Infants and Their Parents: A Narrative Inquiry Study.
Int J Lang Commun Disord. 2025 Jul-Aug;60(4):e70093. doi: 10.1111/1460-6984.70093.
5
Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.
Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.
6
A systematic review of speech, language and communication interventions for children with Down syndrome from 0 to 6 years.
Int J Lang Commun Disord. 2022 Mar;57(2):441-463. doi: 10.1111/1460-6984.12699. Epub 2022 Feb 22.
7
Using Natural Language Processing to Explore Patient Perspectives on AI Avatars in Support Materials for Patients With Breast Cancer: Survey Study.
J Med Internet Res. 2025 Jun 20;27:e70971. doi: 10.2196/70971.
8
Short-Term Memory Impairment
9
How ready is speech-to-text for psychological language research? Evaluating the validity of AI-generated English transcripts for analyzing free-spoken responses in younger and older adults.
Behav Res Methods. 2024 Oct;56(7):7621-7631. doi: 10.3758/s13428-024-02440-1. Epub 2024 May 21.
10
Stench of Errors or the Shine of Potential: The Challenge of (Ir)Responsible Use of ChatGPT in Speech-Language Pathology.
Int J Lang Commun Disord. 2025 Jul-Aug;60(4):e70088. doi: 10.1111/1460-6984.70088.

Cited by

1
Fabla: A voice-based ecological assessment method for securely collecting spoken responses to researcher questions.
Behav Res Methods. 2025 Aug 13;57(9):257. doi: 10.3758/s13428-025-02777-1.
