School of Communication Sciences and Disorders, University of Central Florida, Orlando.
Department of Communication Sciences & Disorders, University of South Florida, Tampa.
J Speech Lang Hear Res. 2024 Jun 6;67(6):1712-1730. doi: 10.1044/2024_JSLHR-23-00759. Epub 2024 May 15.
The goal of this study was to assess various recording methods, including combinations of high- versus low-cost microphones, recording interfaces, and smartphones in terms of their ability to produce commonly used time- and spectral-based voice measurements.
Twenty-four vowel samples representing a diversity of voice quality deviations and severities from a wide age range of male and female speakers were played via a head-and-thorax model and recorded using a high-cost, research-standard GRAS 40AF (GRAS Sound & Vibration) microphone and amplification system. Additional recordings were made using various combinations of headset microphones (AKG C555 L [AKG Acoustics GmbH], Shure SM35-XLR [Shure Incorporated], AVID AE-36 [AVID Products, Inc.]) and audio interfaces (Focusrite Scarlett 2i2 [Focusrite Audio Engineering Ltd.] and PC, Focusrite and smartphone, smartphone via a TRRS adapter), as well as smartphones used directly (Apple iPhone 13 Pro, Google Pixel 6) with their built-in microphones. The effect of background noise from four different room conditions was also evaluated. Vowel samples were analyzed for measures of fundamental frequency, perturbation, cepstral peak prominence, and spectral tilt (low vs. high spectral ratio).
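As an illustration of the spectral tilt measure named above, the following is a minimal sketch of a low vs. high spectral ratio, computed as the energy below a cutoff frequency relative to the energy at or above it, in dB. The 4 kHz cutoff, the Hann window, and the function name `low_high_spectral_ratio` are assumptions made for this example; the abstract does not specify the analysis settings or software used.

```python
import numpy as np

def low_high_spectral_ratio(signal, sample_rate, cutoff_hz=4000.0):
    """Ratio (in dB) of spectral energy below cutoff_hz to energy at or above it.

    The 4 kHz default is an assumed cutoff, not a setting taken from the study.
    """
    x = np.asarray(signal, dtype=float)
    # Power spectrum of the Hann-windowed signal
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    low_energy = spectrum[freqs < cutoff_hz].sum()
    high_energy = spectrum[freqs >= cutoff_hz].sum()
    return 10.0 * np.log10(low_energy / high_energy)

# Synthetic test signal (not study data): a strong 200 Hz component
# plus a weak 6 kHz component, one second long.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 200 * t) + 0.1 * np.sin(2 * np.pi * 6000 * t)
print(round(low_high_spectral_ratio(tone, sample_rate), 1))
```

A signal with relatively more high-frequency energy gives a lower ratio, which is why the measure is commonly read as an index of spectral tilt.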
Results show that a wide variety of recording methods, including smartphones with and without a low-cost headset microphone, can effectively track the wide range of acoustic characteristics in a diverse set of typical and disordered voice samples. Although significant differences in acoustic measures of voice may be observed, the presence of extremely strong correlations (rs > .90) with the recording standard implies a strong linear relationship between the results of different methods, a relationship that may be used to predict and adjust for any observed differences in measurement results.
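The adjustment idea can be sketched as follows: when a lower-cost method correlates strongly with the reference, a least-squares line fitted to paired measurements can map its values onto the reference scale. This is only a sketch under stated assumptions. The function names `fit_linear_adjustment` and `adjust` are hypothetical, the numbers are invented for illustration and are not data from the study, and the abstract does not state that this exact procedure was used.

```python
import numpy as np

def fit_linear_adjustment(test_values, reference_values):
    """Fit reference = slope * test + intercept by least squares.

    Returns the slope, the intercept, and the Pearson correlation r.
    """
    x = np.asarray(test_values, dtype=float)
    y = np.asarray(reference_values, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r

def adjust(values, slope, intercept):
    """Map test-method measurements onto the reference scale."""
    return slope * np.asarray(values, dtype=float) + intercept

# Invented paired measurements (e.g., a cepstral measure in dB) for the same
# samples recorded with a test method and with the reference system.
test_method = [4.0, 6.0, 8.0, 10.0, 12.0]
reference = [5.0, 8.0, 11.0, 14.0, 17.0]

slope, intercept, r = fit_linear_adjustment(test_method, reference)
predicted_reference = adjust([7.0, 9.0], slope, intercept)
```

In practice the fit would be estimated on one set of recordings and checked on another, since a line fitted and evaluated on the same samples overstates how well the adjustment generalizes.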
Because handheld smartphone distance and positioning may be highly variable in actual clinical recording situations, a smartphone paired with a low-cost headset microphone is recommended as an affordable recording method that controls mouth-to-microphone distance and positioning and leaves both hands free to operate the smartphone.