Institute for Medical Engineering, Technische Universität München, Boltzmannstr. 11, 85748, Garching, Germany.
Chair of Complex & Intelligent Systems, Universität Passau, Innstr. 43, 94032, Passau, Germany; ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Eichleitnerstr. 30, 86159, Augsburg, Germany.
Comput Biol Med. 2018 Mar 1;94:106-118. doi: 10.1016/j.compbiomed.2018.01.007. Epub 2018 Jan 31.
Snoring can be excited at different locations within the upper airways during sleep. It was hypothesised that the excitation locations are correlated with distinct acoustic characteristics of the snoring noise. To verify this hypothesis, a database of snore sounds was developed, labelled with the location of sound excitation.
Video and audio recordings taken during drug-induced sleep endoscopy (DISE) examinations at three medical centres were semi-automatically screened for snore events, which were subsequently classified by ENT experts into four classes based on the VOTE classification. The resulting dataset, containing 828 snore events from 219 subjects, was split into Train, Development, and Test sets. A support vector machine (SVM) classifier was trained using low-level descriptors (LLDs) related to energy, spectral features, mel-frequency cepstral coefficients (MFCC), formants, voicing, harmonic-to-noise ratio (HNR), spectral harmonicity, pitch, and microprosodic features.
An unweighted average recall (UAR) of 55.8% was achieved using the full set of LLDs including formants. The best-performing subset is the MFCC-related set of LLDs. A strong difference in performance was observed between the permutations of the train, development, and test partitions, which may be caused by the relatively low number of subjects in the smaller classes of the strongly unbalanced dataset.
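Because the dataset is strongly unbalanced, the UAR reported above weights every class equally, regardless of how many snore events it contains. The following is a minimal sketch of how such a metric can be computed; it is not the authors' evaluation code. The function name `unweighted_average_recall`, the single-letter class labels, and the toy label lists are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of the per-class recalls (each class counts equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (NOT data from the study): one large class, three small ones.
y_true = ["V"] * 6 + ["O"] * 2 + ["T"] * 1 + ["E"] * 1
y_pred = ["V"] * 6 + ["V", "O"] + ["V"] + ["E"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the plain accuracy (0.8) is higher than the UAR (0.625), because errors in the small classes are hidden by the large class when accuracy is used. For a four-class problem, a classifier that always predicts a single class reaches a UAR of 25%, which is a useful reference point when reading the 55.8% figure.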
A database of snoring sounds is presented in which the sounds are classified according to their excitation location, based on objective criteria and verifiable video material. With this database, it was demonstrated that machine classifiers can distinguish different excitation locations of snoring sounds in the upper airway based on acoustic parameters.