Kogan J A, Margoliash D
Department of Organismal Biology and Anatomy, University of Chicago, Illinois 60637, USA.
J Acoust Soc Am. 1998 Apr;103(4):2185-96. doi: 10.1121/1.421364.
The performance of two techniques is compared for automated recognition of bird song units from continuous recordings. The advantages and limitations of dynamic time warping (DTW) and hidden Markov models (HMMs) are evaluated on a large database of male songs of zebra finches (Taeniopygia guttata) and indigo buntings (Passerina cyanea), which have different types of vocalizations and have been recorded under different laboratory conditions. Depending on the quality of recordings and complexity of song, the DTW-based technique gives excellent to satisfactory performance. Under challenging conditions such as noisy recordings or presence of confusing short-duration calls, good performance of the DTW-based technique requires careful selection of templates that may demand expert knowledge. Because HMMs are trained, equivalent or even better performance of HMMs can be achieved based only on segmentation and labeling of constituent vocalizations, albeit with many more training examples than DTW templates. One weakness in HMM performance is the misclassification of short-duration vocalizations or song units with more variable structure (e.g., some calls, and syllables of plastic songs). To address these and other limitations, new approaches for analyzing bird vocalizations are discussed.
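To make the template-matching idea behind the DTW-based technique concrete, here is a minimal sketch of textbook dynamic time warping with nearest-template classification. It is not the authors' implementation. The function names `dtw_distance` and `classify`, the Euclidean frame distance, the unconstrained warping path, and the toy one-dimensional feature frames are all assumptions made for illustration. The sketch also assumes song units have already been segmented, whereas the paper addresses recognition from continuous recordings.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature frames.

    Each frame is a list of floats (e.g. one spectral slice). The local
    distance is Euclidean, which is an assumption of this sketch.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            # Allow a diagonal match, or stretching either sequence by one frame.
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]


def classify(segment, templates):
    """Return the label of the template closest to `segment` under DTW.

    `templates` maps a label to a list of template sequences (hypothetical
    layout for this sketch).
    """
    scored = (
        (dtw_distance(segment, template), label)
        for label, examples in templates.items()
        for template in examples
    )
    return min(scored)[1]


# Toy, made-up data: one-dimensional frames standing in for real features.
templates = {
    "syllable_A": [[[0.0], [1.0], [2.0]]],
    "call_B": [[[5.0], [5.0]]],
}
# A time-stretched version of the syllable_A template.
segment = [[0.0], [1.0], [1.0], [2.0]]
label = classify(segment, templates)
```

Because every decision rests on distance to a handful of stored examples, the choice of templates directly determines accuracy. This is consistent with the abstract's remark that good DTW performance under challenging conditions requires careful template selection.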