Centre for Digital Music, Queen Mary University of London, London, United Kingdom.
J Acoust Soc Am. 2021 Jul;150(1):202. doi: 10.1121/10.0005509.
The synthesis of convincing acoustic drum sounds remains an open problem. In this paper, a method for analysing and synthesising pitch glide in drums is proposed, whereby the discrete cosine transform (DCT) of an unwindowed drum sound is modelled. This is an extension of the scheme initially proposed by Kirby and Sandler [(2020). Proceedings of the 23rd International Conference on Digital Audio Effects, Vienna, Austria, pp. 155-162], which was able to reproduce key components of drum sounds accurately enough that they could not be distinguished from the reference samples. Here, drum modes were analysed in greater detail for a tom-tom struck at 67 different intensities to investigate their evolution with strike velocity. A clear evolution was observed in the DCT features, and interpolation was used to synthesise the modes of intermediate velocity. These synthesised modes were evaluated objectively through null testing, which showed that a continuous blending of strike velocities could be achieved throughout the data set. An AB listening test was also performed, where 20 participants attempted to distinguish between pairs of real and synthesised sounds. Exactly 50% accuracy was achieved overall, which demonstrates that the synthesised samples were deemed to sound as realistic as genuine samples. These results demonstrate that the DCT representation is a valuable framework for analysis and synthesis of drum sounds. It is also likely that this approach could be applied to other instruments.
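The analysis, interpolation and null-testing steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the decaying sinusoid stands in for a recorded drum mode, the helper names (`dct_matrix`, `toy_mode`, `null_test_db`) are hypothetical, and linear interpolation of raw DCT coefficients is used as a simple stand-in for the interpolation of DCT features, whose details the abstract does not give.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix; its transpose is the inverse transform."""
    k = np.arange(n)[:, None]  # DCT bin index (rows)
    t = np.arange(n)[None, :]  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    c[0, :] /= np.sqrt(2.0)    # scale the DC row so that C @ C.T is the identity
    return c


def toy_mode(n, freq, decay, amp, sr=8000.0):
    """Hypothetical stand-in for one drum mode: an exponentially decaying sinusoid."""
    t = np.arange(n) / sr
    return amp * np.exp(-decay * t) * np.sin(2.0 * np.pi * freq * t)


def null_test_db(reference, candidate, eps=1e-30):
    """Residual energy of (reference - candidate) relative to the reference, in dB.

    More negative values mean better cancellation; eps avoids log10(0).
    """
    residual = reference - candidate
    return 10.0 * np.log10((np.sum(residual ** 2) + eps) / (np.sum(reference ** 2) + eps))


n = 512
C = dct_matrix(n)

# Assumed data: the same toy mode at a soft, a hard and an intermediate strike level.
soft = toy_mode(n, freq=110.0, decay=6.0, amp=0.30)
hard = toy_mode(n, freq=110.0, decay=6.0, amp=1.00)
target = toy_mode(n, freq=110.0, decay=6.0, amp=0.65)

# DCT of the whole, unwindowed signal.
soft_dct = C @ soft
hard_dct = C @ hard

# Interpolate halfway between the two strikes in the DCT domain, then invert.
mid_dct = 0.5 * (soft_dct + hard_dct)
mid = C.T @ mid_dct

null_db = null_test_db(target, mid)        # interpolated mode against the target
baseline_db = null_test_db(target, hard)   # nearest real strike against the target
print(f"null test, interpolated: {null_db:.1f} dB")
print(f"null test, nearest real strike: {baseline_db:.1f} dB")
```

Because the toy strikes differ only in amplitude and the DCT is linear, the interpolated mode cancels the target almost perfectly here. Real drum modes also change shape with strike velocity, which is why the paper models features of the DCT rather than blending raw coefficients. For longer signals, `scipy.fft.dct(x, type=2, norm="ortho")` computes the same transform without building an n-by-n matrix.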