Di Tullio Ronald W, Wei Linran, Balasubramanian Vijay
David Rittenhouse Laboratory, Department of Physics and Astronomy, University of Pennsylvania, USA.
Computational Neuroscience Initiative, University of Pennsylvania, USA.
bioRxiv. 2024 Jul 2:2024.06.20.599962. doi: 10.1101/2024.06.20.599962.
We propose that listeners can use temporal regularities (spectro-temporal correlations that change smoothly over time) to discriminate animal vocalizations within and between species. To test this idea, we used Slow Feature Analysis (SFA) to find the most temporally regular components of vocalizations from birds (blue jay, house finch, American yellow warbler, and great blue heron), humans (English speakers), and rhesus macaques. We projected vocalizations into the learned feature space and used a trained classifier to test intra-class (same speaker/species) and inter-class (different speakers/species) auditory discrimination. We found that: 1) Vocalization discrimination was excellent (> 95%) in all cases; 2) Performance depended primarily on the ~10 most temporally regular features; 3) Most vocalizations are dominated by ~10 features with high temporal regularity; and 4) These regular features are highly correlated with the most predictable components of animal sounds.
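To make the idea of "most temporally regular components" concrete, the following is a minimal sketch of linear SFA in NumPy. It is an illustration under stated assumptions, not the authors' pipeline: the function name `linear_sfa`, the synthetic sources, and the mixing matrix are all hypothetical, and the abstract does not specify any preprocessing, nonlinear expansion, or classifier, so none is included. The sketch whitens the input, then finds the directions along which the signal changes most slowly from one time step to the next.

```python
import numpy as np


def linear_sfa(x, n_features):
    """Linear Slow Feature Analysis (illustrative sketch).

    x: array of shape (T, D), one row per time step.
    Returns (W, mean, slowness): (x - mean) @ W gives unit-variance,
    decorrelated features ordered from slowest to fastest.
    """
    mean = x.mean(axis=0)
    xc = x - mean

    # Step 1: whiten, so every direction has unit variance.
    cov = xc.T @ xc / (len(xc) - 1)
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-10 * evals.max()  # drop near-degenerate directions
    whitener = evecs[:, keep] / np.sqrt(evals[keep])
    z = xc @ whitener

    # Step 2: in whitened space, find directions whose temporal
    # differences have the smallest variance (the "slowest" ones).
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / (len(dz) - 1)
    d_evals, d_evecs = np.linalg.eigh(dcov)  # eigenvalues in ascending order

    W = whitener @ d_evecs[:, :n_features]
    return W, mean, d_evals[:n_features]


# Synthetic demo (assumed data): one slow source hidden in a linear mixture.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 2000)
slow = np.sin(t)
fast = np.sin(37 * t)
noise = rng.standard_normal(t.size)
sources = np.stack([slow, fast, noise], axis=1)
mixing = np.array([[1.0, 0.5, -0.3],
                   [0.2, 1.0, 0.4],
                   [-0.6, 0.3, 1.0]])
x = sources @ mixing

W, mean, slowness = linear_sfa(x, n_features=2)
y = (x - mean) @ W

# The slowest feature should track the slow source up to sign and scale.
corr = abs(np.corrcoef(y[:, 0], slow)[0, 1])
print(f"|correlation| between slowest feature and slow source: {corr:.3f}")
```

In a setting like the one described above, the rows of `x` would be some time-frequency representation of a vocalization rather than a synthetic mixture, and the projected features `y` would be passed to a separately trained classifier.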