Lamothe Charly, Obliger-Debouche Manon, Best Paul, Trapeau Régis, Ravel Sabrina, Artières Thierry, Marxer Ricard, Belin Pascal
La Timone Neuroscience Institute UMR 7289, CNRS, Aix-Marseille University, Marseille, France.
Laboratoire d'Informatique et Systèmes UMR 7020, CNRS, Aix-Marseille University, Marseille, France.
Sci Data. 2025 May 13;12(1):782. doi: 10.1038/s41597-025-04951-8.
Non-human primates, our closest relatives, use a wide range of complex vocal signals to communicate within their species. Previous research on marmoset (Callithrix jacchus) vocalizations has been limited by sampling rates too low to cover the animals' full hearing range and by labeling insufficient for advanced analyses with Deep Neural Networks (DNNs). Here, we provide a database of common marmoset vocalizations, recorded continuously at a sampling rate of 96 kHz in an animal holding facility that housed about 20 marmosets at a time in three cages. The dataset comprises more than 800,000 files, amounting to 253 hours of data collected over 40 months. Each recording lasts a few seconds and captures the marmosets' social vocalizations, encompassing their entire known vocal repertoire during the experimental period. Around 215,000 calls are annotated with the vocalization type. We also provide a trained classifier to assist future investigations. Finally, we validated the dataset by sampling 700 representative recordings and having four experts cross-examine them.
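As a rough illustration of why the sampling rate matters, the sketch below computes the highest frequency a 96 kHz recording can represent (the Nyquist frequency, half the sampling rate) and a back-of-envelope mean file duration from the rounded figures quoted above. The function names are hypothetical and not part of the dataset or its tooling, and because the abstract reports "more than" 800,000 files, the duration is only an approximate upper bound on the true mean.

```python
def nyquist_hz(sampling_rate_hz: float) -> float:
    """Highest frequency representable at a given sampling rate."""
    return sampling_rate_hz / 2.0


def mean_duration_s(total_hours: float, n_files: int) -> float:
    """Average file length in seconds, from total duration and file count."""
    return total_hours * 3600.0 / n_files


if __name__ == "__main__":
    # Figures taken from the abstract; the file count is a rounded lower bound.
    print(nyquist_hz(96_000))                # 48000.0 Hz
    print(mean_duration_s(253, 800_000))     # about 1.14 s
```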