Marelli Damián, Baumgartner Robert, Majdak Piotr
School of Electrical Engineering and Computer Science, University of Newcastle, Callaghan, NSW 2308, Australia; Acoustics Research Institute, Austrian Academy of Sciences, Austria
Acoustics Research Institute, Austrian Academy of Sciences, 1040 Vienna, Austria
IEEE Trans Audio Speech Lang Process. 2015 Jul 1;23(7):1130-1143.
Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology and are essential for listeners to localize sound sources in virtual auditory displays. Since rendering complex virtual scenes is computationally demanding, we propose four algorithms for efficiently representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. All four algorithms use sparse approximation procedures to minimize the computational complexity while maintaining perceptually relevant HRTF properties. The first two algorithms optimize, separately for each HRTF, the complexity of the associated transfer matrix while the FBs are kept fixed. The other two algorithms jointly optimize the FBs and the transfer matrices for complete HRTF sets, in two variants: the first variant aims at minimizing the complexity of the transfer matrices, while the second aims at minimizing the complexity of the FBs. Numerical experiments investigate the latency-complexity trade-off and show that the proposed methods offer significant computational savings when compared with other available approaches. Psychoacoustic localization experiments were both modeled and conducted to find a reasonable approximation tolerance, i.e., one at which the subband representation introduces no significant degradation of localization performance.
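The idea of trading approximation tolerance against complexity can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes the transfer-matrix entries are given as a flat list of coefficients, uses a plain relative l2 error in place of a perceptually motivated criterion, and applies simple hard thresholding. The function name `sparsify` and the example values are hypothetical.

```python
import math

def sparsify(coeffs, tol):
    """Zero out as many small entries as possible while the relative
    l2 error of the discarded part stays at or below `tol`.

    Illustrative assumption: plain hard thresholding with an l2 error
    measure, not the perceptual criterion used in the paper.
    Returns the sparsified list and the number of entries kept.
    """
    total = sum(abs(c) ** 2 for c in coeffs)
    if total == 0.0:
        return [0.0] * len(coeffs), 0
    # Visit entries from smallest to largest magnitude, so the entries
    # that contribute least to the error are discarded first.
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    kept = list(coeffs)
    dropped_energy = 0.0
    n_kept = len(coeffs)
    for i in order:
        energy = abs(coeffs[i]) ** 2
        if math.sqrt((dropped_energy + energy) / total) > tol:
            break  # discarding this entry would exceed the tolerance
        dropped_energy += energy
        kept[i] = 0.0
        n_kept -= 1
    return kept, n_kept

# Hypothetical coefficients standing in for transfer-matrix entries.
example = [1.0, 0.5, 0.01, -0.02, 0.3, 0.001]
sparse, n_kept = sparsify(example, tol=0.05)
print(sparse, n_kept)
```

In this simplified picture, fewer retained entries mean fewer multiplications per subband sample, and a larger tolerance allows more entries to be dropped.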