Awan Shaheen N, Bahr Ruth, Watts Stephanie, Boyer Micah, Budinsky Robert, Bensoussan Yael
School of Communication Sciences & Disorders, University of Central Florida, Orlando, Florida.
Department of Communication Sciences & Disorders, University of South Florida, Tampa, Florida.
J Voice. 2024 Sep 20. doi: 10.1016/j.jvoice.2024.08.029.
As part of a larger goal to create best practices for voice data collection to fuel voice artificial intelligence (AI) research, the objective of this study was to investigate the ability of readily available iOS and Android tablets with and without low-cost headset microphones to produce recordings and subsequent acoustic measures of voice comparable to "research quality" instrumentation.
Recordings of 24 sustained vowel samples representing a wide range of typical and disordered voices were played via a head-and-torso model and recorded using a research quality standard microphone/preamplifier/audio interface. Acoustic measurements from the standard were compared with those from two popular tablets, recorded using their built-in microphones and using low-cost headset microphones at different distances from the mouth.
Voice measurements obtained via tablets paired with headset microphones close to the mouth (2.5 and 5 cm) correlated strongly (r's > 0.90) with the research standard and showed no significant differences for measures of vocal frequency and perturbation. In contrast, voice measurements obtained using the tablets' built-in microphones at typical reading distances (30 and 45 cm) tended to show substantial variability, greater mean differences, and relatively poorer correlations vs the standard.
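The kind of agreement check described above can be sketched as follows. This is a minimal illustration only, not the authors' analysis: it assumes that the reported r values are Pearson correlations, the function names `pearson_r` and `mean_difference` are hypothetical, and the fundamental frequency values are invented for the example rather than taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of measurements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_difference(reference, device):
    """Mean of (device - reference): the average bias of the device."""
    return sum(d - r for r, d in zip(reference, device)) / len(reference)


# Invented illustrative values (Hz), NOT data from the study:
# the same five samples measured on a reference chain and on a
# hypothetical tablet + headset setup.
standard_f0 = [110.0, 145.0, 190.0, 220.0, 251.0]
headset_f0 = [110.5, 144.5, 190.5, 219.5, 251.5]

r = pearson_r(standard_f0, headset_f0)
bias = mean_difference(standard_f0, headset_f0)
print(f"r = {r:.4f}, mean difference = {bias:.2f} Hz")
```

A high r together with a mean difference near zero is the pattern reported for the close-mouth headset conditions; a low r or a large mean difference would correspond to the pattern reported for the built-in microphones at 30 and 45 cm.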
Findings from this study support the preliminary recommendation of the Bridge2AI-Voice Consortium that smartphones paired with low-cost headset microphones are an adequate method of recording for large-scale voice data collection in a variety of clinical and nonclinical settings. Compared with recording directly with a tablet, a headset microphone controls for recording distance and reduces the effects of background noise, resulting in decreased variability in recording quality.
Data supporting the results reported in this article may be obtained upon request from the contact author.