Audio Analysis Lab, Department of Architecture, Design and Media Technology, Aalborg University, Aalborg, Denmark.
J Acoust Soc Am. 2013 May;133(5):3062-71. doi: 10.1121/1.4799004.
Speech enhancement and separation algorithms sometimes employ a two-stage processing scheme in which the signal is first mapped to an intermediate, low-dimensional parametric description, after which the parameters are mapped, via a vector quantizer, to vectors in codebooks trained on, for example, individual noise-free sources. To obtain accurate parameters, a good estimator, such as a maximum likelihood estimator, must be used when finding the parameters of the intermediate representation. This leaves some questions unanswered, however, such as which metrics to use in the subsequent vector quantization process and how to derive them systematically. This paper aims to answer these questions. Suitable metrics are presented and derived, and their use is exemplified on a number of different signal models by deriving closed-form expressions. In essence, the metrics take into account in the vector quantization process that some parameters may have been estimated more accurately than others and that there may be dependencies between the estimation errors.
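The core idea of such a metric can be sketched as a Mahalanobis-type distance in which the inverse of the estimation-error covariance weights the codebook search, so that accurately estimated parameters count more and correlated errors are accounted for. The following is a minimal illustrative sketch, not the paper's exact derivation; the function name `mahalanobis_vq`, the toy codebook, and the assumed error covariance `err_cov` are all hypothetical:

```python
import numpy as np

def mahalanobis_vq(theta_hat, codebook, err_cov):
    """Select the codebook entry closest to the estimated parameter
    vector theta_hat under a Mahalanobis-type metric weighted by the
    inverse of the estimation-error covariance (a sketch of weighting
    the VQ step by how accurately each parameter was estimated)."""
    prec = np.linalg.inv(err_cov)         # precision (inverse covariance)
    diffs = codebook - theta_hat          # shape (K, d): one row per entry
    # quadratic form diff^T * prec * diff for every codebook entry
    dists = np.einsum('kd,de,ke->k', diffs, prec, diffs)
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

# Toy example with two parameters; the second is assumed to be
# estimated far more accurately (variance 0.01 vs 1.0).
theta_hat = np.array([1.0, 2.0])
codebook = np.array([[1.4, 2.0],    # off in the poorly estimated dim
                     [1.0, 2.3]])   # off in the accurately estimated dim
err_cov = np.diag([1.0, 0.01])      # assumed estimation-error covariance
idx, best = mahalanobis_vq(theta_hat, codebook, err_cov)
# A plain Euclidean metric would favor entry 1 (distance^2 0.09 vs 0.16),
# but the weighted metric penalizes the mismatch in the accurately
# estimated dimension (0.09 * 100 = 9 vs 0.16) and selects entry 0.
```

Under a Euclidean metric both deviations would be treated alike; the weighted metric instead trusts the accurately estimated parameter and tolerates deviation along the noisy one, which is the behavior the abstract describes.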