Department of Computer Science, Duke University, Durham, United States.
Center for Cognitive Neurobiology, Duke University, Durham, United States.
eLife. 2021 May 14;10:e67855. doi: 10.7554/eLife.67855.
Increases in the scale and complexity of behavioral data pose a growing challenge for data analysis. A common strategy involves replacing entire behaviors with small numbers of handpicked, domain-specific features, but this approach suffers from several crucial limitations. For example, handpicked features may miss important dimensions of variability, and correlations among them complicate statistical testing. Here, by contrast, we apply the variational autoencoder (VAE), an unsupervised learning method, to learn features directly from data and quantify the vocal behavior of two model species: the laboratory mouse and the zebra finch. The VAE converges on a parsimonious representation that outperforms handpicked features on a variety of common analysis tasks, enables the measurement of moment-by-moment vocal variability on the timescale of tens of milliseconds in the zebra finch, provides strong evidence that mouse ultrasonic vocalizations do not cluster as is commonly believed, and captures the similarity of tutor and pupil birdsong with qualitatively higher fidelity than previous approaches. In all, we demonstrate the utility of modern unsupervised learning approaches to the quantification of complex and high-dimensional vocal behavior.
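As a rough illustration of the method named in the abstract, the sketch below shows the two mechanics that define a VAE: the reparameterization step (sampling a latent vector as z = mu + sigma * eps so the sampling is differentiable) and the two terms of the evidence lower bound (reconstruction error plus a KL penalty toward a standard-normal prior). This is a minimal pure-Python toy, not the paper's model: the actual work trains a deep convolutional VAE on syllable spectrograms, and the linear `encode`/`decode` functions here are hypothetical stand-ins.

```python
# Minimal VAE sketch: reparameterization trick + ELBO terms.
# Illustrative only; encoder/decoder are hypothetical linear stand-ins,
# not the convolutional networks used in the paper.
import math
import random

random.seed(0)

def encode(x):
    # Hypothetical encoder: map input to latent mean and log-variance.
    mu = [0.5 * xi for xi in x]
    log_var = [-1.0 for _ in x]
    return mu, log_var

def reparameterize(mu, log_var):
    # z = mu + sigma * eps, with eps ~ N(0, 1); sampling stays
    # differentiable with respect to mu and log_var during training.
    return [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def decode(z):
    # Hypothetical decoder: map the latent vector back to input space.
    return [2.0 * zi for zi in z]

def elbo_terms(x, x_hat, mu, log_var):
    # Reconstruction error (squared error here) and the KL divergence
    # of N(mu, sigma^2) from the standard-normal prior N(0, 1).
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    kl = -0.5 * sum(1 + lv - m ** 2 - math.exp(lv)
                    for m, lv in zip(mu, log_var))
    return recon, kl

x = [0.2, -0.4, 0.7]            # stand-in for a flattened spectrogram patch
mu, log_var = encode(x)
z = reparameterize(mu, log_var)
recon, kl = elbo_terms(x, decode(z), mu, log_var)
print(recon >= 0.0, kl >= 0.0)  # both ELBO terms are non-negative here
```

Minimizing `recon + kl` over many spectrograms is what drives the encoder toward the parsimonious latent representation the abstract describes; the learned latent vectors then serve as the data-driven features compared against handpicked ones.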