Mellinger D K, Clark C W
Cooperative Institute for Marine Resources Studies, Oregon State University, Newport 97365, USA.
J Acoust Soc Am. 2000 Jun;107(6):3518-29. doi: 10.1121/1.429434.
A method is described for the automatic recognition of transient animal sounds. Automatic recognition can be used in wild animal research, including studies of behavior, population, and impact of anthropogenic noise. The method described here, spectrogram correlation, is well suited to recognition of animal sounds consisting of tones and frequency sweeps. For a sound type of interest, a two-dimensional synthetic kernel is constructed and cross-correlated with a spectrogram of a recording, producing a recognition function: the likelihood at each point in time that the sound type was present. A threshold is applied to this function to obtain discrete detection events, instants at which the sound type of interest was likely to be present. An extension of this method handles the temporal variation commonly present in animal sounds. Spectrogram correlation was compared to three other methods that have been used for automatic call recognition: matched filters, neural networks, and hidden Markov models. The test data set consisted of bowhead whale (Balaena mysticetus) end notes from songs recorded in Alaska in 1986 and 1988. The method had a success rate of about 97.5% on this problem, and the comparison indicated that it could be especially useful for detecting a call type when relatively few (5-200) instances of the call type are known.
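The pipeline outlined in the abstract (synthetic kernel, cross-correlation with a spectrogram, threshold) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it (`spectrogram`, `sweep_kernel`, `recognition_function`, `detect`, `run_demo`) is hypothetical, and so are the parameter values. The kernel's cross-section in frequency (positive along the sweep contour, negative on the flanks), the median-based threshold, and the synthetic test signal (a 100-300 Hz sweep in white noise, not bowhead recordings) are all assumptions made for illustration. The score it produces is an unnormalized correlation value rather than a calibrated likelihood.

```python
import numpy as np


def spectrogram(x, n_fft=256, hop=64):
    """Magnitude spectrogram, shape (n_freq_bins, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop:i * hop + n_fft] * window for i in range(n_frames)], axis=1
    )
    return np.abs(np.fft.rfft(frames, axis=0))


def sweep_kernel(f_start, f_end, duration, fs, n_fft=256, hop=64, width_hz=30.0):
    """Synthetic 2-D kernel for a linear frequency sweep (assumed shape).

    Each column is positive near the sweep frequency and negative on the
    flanks, so broadband noise contributes roughly zero to the correlation.
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    n_cols = max(1, int(round(duration * fs / hop)))
    centers = np.linspace(f_start, f_end, n_cols)
    d = (freqs[:, None] - centers[None, :]) / width_hz
    return (1.0 - d**2) * np.exp(-0.5 * d**2)


def recognition_function(spec, kernel):
    """Slide the kernel along the time axis; one score per start frame."""
    n_cols = kernel.shape[1]
    n_out = spec.shape[1] - n_cols + 1
    return np.array(
        [np.sum(spec[:, t:t + n_cols] * kernel) for t in range(n_out)]
    )


def detect(score, threshold):
    """Return the peak frame of each contiguous run above the threshold."""
    events, t = [], 0
    while t < len(score):
        if score[t] > threshold:
            end = t
            while end + 1 < len(score) and score[end + 1] > threshold:
                end += 1
            events.append(t + int(np.argmax(score[t:end + 1])))
            t = end + 1
        else:
            t += 1
    return events


def run_demo(seed=0):
    """Embed one synthetic sweep in noise and try to detect it."""
    fs, n_fft, hop = 2000, 256, 64
    f0, f1, dur = 100.0, 300.0, 1.0
    rng = np.random.default_rng(seed)
    x = 0.5 * rng.standard_normal(10 * fs)            # 10 s of white noise
    t = np.arange(int(dur * fs)) / fs
    call = np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / dur * t**2))
    start = 4 * fs                                     # call begins at 4 s
    x[start:start + len(call)] += call

    spec = spectrogram(x, n_fft, hop)
    kernel = sweep_kernel(f0, f1, dur, fs, n_fft, hop)
    score = recognition_function(spec, kernel)

    # Assumed threshold rule: median plus 10 robust standard deviations.
    med = np.median(score)
    mad = np.median(np.abs(score - med))
    threshold = med + 10 * 1.4826 * mad

    times = [frame * hop / fs for frame in detect(score, threshold)]
    return times, score


if __name__ == "__main__":
    print("detection times (s):", run_demo()[0])
```

In practice the threshold sets the trade-off between missed calls and false detections, so it would be tuned on labeled recordings rather than fixed by a rule like the one assumed here.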