Department of Biomedical Engineering, SRM Institute of Science and Technology, (Deemed to be University under section 3 of UGC Act 1956), Kattankulathur, 603203, Tamil Nadu, India.
School of Computer and Communication Engineering, Universiti Malaysia Perlis (UniMAP), Campus Pauh Putra, Perlis 02600, Malaysia.
Comput Methods Programs Biomed. 2018 Mar;155:39-51. doi: 10.1016/j.cmpb.2017.11.021. Epub 2017 Nov 28.
Infant cry signals carry several levels of information about the reason for crying (hunger, pain, sleepiness and discomfort) or the pathological status of an infant (asphyxia, deafness, jaundice, prematurity, autism, etc.) and are therefore suited to early diagnosis. In this work, a combination of wavelet packet based features and an Improved Binary Dragonfly Optimization based feature selection method was proposed to classify different types of infant cry signals.
Cry signals from two different databases were used. The first database contains 507 normal (N), 340 asphyxia (A), 879 deaf (D), 350 hungry (H) and 192 pain (P) cry samples. The second database contains 513 jaundice (J), 531 premature (Prem) and 45 normal (N) cry samples. Three groups of features were extracted: wavelet packet transform based energy and non-linear entropies (496 features), Linear Predictive Coding (LPC) based cepstral features (56 features) and Mel-frequency Cepstral Coefficients (MFCCs; 16 features). The combined feature set consists of 568 features. To mitigate the curse of dimensionality, an improved binary dragonfly optimization (IBDFO) algorithm was proposed to select the most salient features. Finally, an Extreme Learning Machine (ELM) kernel classifier was used to classify the different types of infant cry signals, using both the full feature set and the selected, highly informative features.
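As an illustration of the wavelet packet energy and entropy features described above, the following is a minimal sketch, not the authors' implementation. It assumes a Haar wavelet, a caller-chosen decomposition level, and Shannon entropy as the entropy measure. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math


def haar_split(x):
    """One Haar analysis step: return (approximation, detail), each half the length.

    Assumption: Haar filters are used only for simplicity; the abstract does
    not name the wavelet. A trailing odd sample is dropped.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_nodes(signal, level):
    """Full wavelet packet tree: split every node at each level.

    Returns the 2**level terminal nodes. The signal length should be a
    multiple of 2**level so that no samples are discarded.
    """
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def packet_energy_entropy(signal, level):
    """Per-node energy and Shannon entropy (hypothetical feature definitions)."""
    energies, entropies = [], []
    for node in wavelet_packet_nodes(signal, level):
        energy = sum(c * c for c in node)
        energies.append(energy)
        if energy > 0:
            # Normalise squared coefficients into a probability distribution.
            probs = [c * c / energy for c in node if c != 0]
            entropies.append(-sum(p * math.log(p) for p in probs))
        else:
            entropies.append(0.0)
    return energies, entropies


if __name__ == "__main__":
    # Toy frame standing in for a short segment of a cry recording.
    frame = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
    energies, entropies = packet_energy_entropy(frame, level=2)
    print(energies, entropies)
```

Because the Haar transform is orthonormal, the node energies sum to the energy of the input frame, which gives a quick sanity check. Feature vectors built this way could then be concatenated with the LPC-based and MFCC features before feature selection.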
Several two-class and multi-class classification experiments were conducted. In the two-class experiments, maximum accuracies of 90.18% for H vs P, 100% for A vs N, 100% for D vs N and 97.61% for J vs Prem were achieved using the features selected by IBDFO (only 204 of the 568 features). In the multi-class experiments, the selected features differentiated three classes (N, A and D) with an accuracy of 100% and seven classes with an accuracy of 97.62%.
The experimental results indicate that the proposed combination of feature extraction and feature selection methods achieves high classification accuracy and may be used to detect subtle changes in cry signals.