Kim Kwang S, Wang Hantao, Max Ludo
Department of Speech and Hearing Sciences, University of Washington, Seattle.
Haskins Laboratories, New Haven, CT.
J Speech Lang Hear Res. 2020 Aug 10;63(8):2522-2534. doi: 10.1044/2020_JSLHR-19-00419. Epub 2020 Jul 8.
Purpose: Various aspects of speech production related to auditory-motor integration and learning have been examined through auditory feedback perturbation paradigms in which participants' acoustic speech output is experimentally altered and played back via earphones/headphones "in real time." Scientific rigor requires high precision in determining and reporting the involved hardware and software latencies. Many reports in the literature, however, are not consistent with the minimum achievable latency for a given experimental setup. Here, we focus specifically on this methodological issue associated with implementing real-time auditory feedback perturbations, and we offer concrete suggestions for increased reproducibility in this particular line of work.

Method: Hardware and software latencies as well as total feedback loop latency were measured for formant perturbation studies with the Audapter software. Measurements were conducted for various audio interfaces, desktop and laptop computers, and audio drivers. An approach for lowering Audapter's software latency through nondefault parameter specification was also tested.

Results: Oft-overlooked hardware-specific latencies were not negligible for some of the tested audio interfaces (adding up to 15 ms). Total feedback loop latencies (including both hardware and software latency) were also generally larger than claimed in the literature. Nondefault parameter values can improve Audapter's own processing latency without negative impact on formant tracking.

Conclusions: Audio interface selection and software parameter optimization substantially affect total feedback loop latency. Thus, the actual total latency (hardware plus software) needs to be correctly measured and described in all published reports. Future speech research with "real-time" auditory feedback perturbations should increase scientific rigor by minimizing this latency.
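As an illustration of what measuring total feedback loop latency can involve, the following is a minimal Python sketch. It assumes that the microphone input and the earphone output have been recorded simultaneously as two channels on an independent recorder, and it estimates the delay between them by cross-correlation. This is not the measurement procedure described in the article, and it does not use Audapter; the function name `estimate_latency_ms`, the synthetic noise burst, and the 48 kHz sample rate are all assumptions chosen for illustration. The 15 ms delay in the demonstration merely echoes the size of the hardware latency mentioned in the Results.

```python
import numpy as np


def estimate_latency_ms(reference, delayed, sample_rate_hz):
    """Estimate how far `delayed` lags behind `reference`, in milliseconds.

    Hypothetical helper: both inputs are 1-D arrays sampled at the same
    rate, e.g. the input and output channels of a loopback recording.
    """
    reference = np.asarray(reference, dtype=float)
    delayed = np.asarray(delayed, dtype=float)
    # Full cross-correlation; output index i corresponds to a lag of
    # i - (len(reference) - 1) samples.
    xcorr = np.correlate(delayed, reference, mode="full")
    lag_samples = int(np.argmax(xcorr)) - (len(reference) - 1)
    return 1000.0 * lag_samples / sample_rate_hz


if __name__ == "__main__":
    sample_rate_hz = 48000  # assumed sample rate
    delay_samples = 720     # 720 samples at 48 kHz, i.e. 15 ms

    # Synthetic test signal: a short noise burst stands in for speech.
    rng = np.random.default_rng(0)
    burst = rng.standard_normal(480)

    mic_in = np.zeros(4800)
    mic_in[500:980] = burst

    # Simulated feedback channel: the same burst, shifted later in time.
    ear_out = np.zeros(4800)
    ear_out[500 + delay_samples:980 + delay_samples] = burst

    print(estimate_latency_ms(mic_in, ear_out, sample_rate_hz))
```

A delay estimated this way covers the whole loop, hardware plus software, which is the quantity the Conclusions ask authors to report; separating the two contributions would require additional measurements, for example with the software processing bypassed.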