Identifying languages in a novel dataset: ASMR-whispered speech.

Authors

Song Meishu, Yang Zijiang, Parada-Cabaleiro Emilia, Jing Xin, Yamamoto Yoshiharu, Schuller Björn

Affiliations

Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany.

Educational Physiology Laboratory, The University of Tokyo, Tokyo, Japan.

Publication information

Front Neurosci. 2023 Jun 15;17:1120311. doi: 10.3389/fnins.2023.1120311. eCollection 2023.

DOI:10.3389/fnins.2023.1120311

PMID:37397449

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10308374/

Abstract

INTRODUCTION

The Autonomous Sensory Meridian Response (ASMR) is a combination of sensory phenomena involving electrostatic-like tingling sensations, which emerge in response to certain stimuli. Despite the overwhelming popularity of ASMR on social media, no open-source databases of ASMR-related stimuli are yet available, which leaves the phenomenon largely inaccessible to the research community and therefore almost completely unexplored. In this regard, we present the ASMR Whispered-Speech (ASMR-WS) database.

METHODS

ASMR-WS is a novel database of whispered speech, specifically tailored to promote the development of ASMR-like unvoiced Language Identification (unvoiced-LID) systems. The ASMR-WS database encompasses 38 videos, with a total duration of 10 h and 36 min, and includes seven target languages (Chinese, English, French, Italian, Japanese, Korean, and Spanish). Along with the database, we present baseline results for unvoiced-LID on the ASMR-WS database.

RESULTS

Our best results on the seven-class problem, obtained with segments of 2 s length, MFCC acoustic features, and a CNN classifier, reached an unweighted average recall (UAR) of 85.74% and an accuracy of 90.83%.
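
The two reported metrics can differ when classes are imbalanced: accuracy counts every segment equally, whereas unweighted average recall averages the per-class recalls so that each language counts equally. The following is a minimal sketch of that difference, not the authors' evaluation code; the function names and the toy labels are assumptions made purely for illustration.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls; every class has the same weight."""
    total = defaultdict(int)  # number of segments per true class
    hits = defaultdict(int)   # correctly classified segments per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not taken from ASMR-WS).
y_true = ["en"] * 8 + ["ko"] * 2
y_pred = ["en"] * 8 + ["en", "ko"]

acc = accuracy(y_true, y_pred)                   # 9 of 10 segments correct
uar = unweighted_average_recall(y_true, y_pred)  # mean of 8/8 and 1/2
print(f"accuracy = {acc:.2f}, UAR = {uar:.2f}")
```

In this toy case the minority class is recognized only half of the time, which lowers UAR more than accuracy. A gap in the same direction appears between the reported 85.74% UAR and 90.83% accuracy.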

DISCUSSION

In future work, we would like to examine the duration of speech samples more closely, as results varied across the combinations applied here. To enable further research in this area, the ASMR-WS database, together with the partitioning used for the presented baseline, is made accessible to the research community.
