Zhang He-Hua, Yang Liuyang, Liu Yuchuan, Wang Pin, Yin Jun, Li Yongming, Qiu Mingguo, Zhu Xueru, Yan Fang
Institute of Surgery Research, Daping Hospital, Third Military Medical University, Chongqing, 400042, China.
College of Communication Engineering, Chongqing University, Chongqing, 400044, China.
Biomed Eng Online. 2016 Nov 16;15(1):122. doi: 10.1186/s12938-016-0242-6.
The use of speech-based data in the classification of Parkinson's disease (PD) has been shown in recent years to provide an effective, non-invasive mode of classification. Consequently, interest has grown in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One obstacle to optimizing classification is noise within the collected speech samples, which must be reduced to ensure better classification accuracy and stability. While the currently used methods are effective, the use of instance selection has seldom been examined.
In this study, a PD classification algorithm that combines a multi-edit-nearest-neighbor (MENN) algorithm with an ensemble learning algorithm was proposed and examined. First, the MENN algorithm is applied iteratively to select optimal training speech samples, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, either random forest (RF) or decorrelated neural network ensembles (DNNE), is trained on the selected training samples. Lastly, the trained ensemble is applied to the test samples for PD classification. The proposed method was examined using recently deposited public data and compared against other currently used algorithms for validation.
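The instance-selection step described above can be illustrated with a minimal sketch of the classic multi-edit procedure. This is not the authors' implementation: the function names, the Euclidean 1-NN rule, the number of subsets (`n_subsets`), and the stopping rule (`patience` consecutive rounds without a discard) are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical sample type for this sketch: (feature vector, class label).
Sample = Tuple[Sequence[float], int]


def nearest_label(x: Sequence[float], reference: List[Sample]) -> int:
    """Return the label of the reference sample closest to x (1-NN, squared Euclidean)."""
    best = min(reference, key=lambda s: sum((a - b) ** 2 for a, b in zip(x, s[0])))
    return best[1]


def multi_edit(samples: List[Sample], n_subsets: int = 3,
               patience: int = 3, seed: int = 0) -> List[Sample]:
    """Iteratively discard samples that 1-NN on a neighbouring subset misclassifies.

    Assumed stopping rule: stop after `patience` consecutive rounds with no discard.
    """
    rng = random.Random(seed)
    kept = list(samples)
    quiet_rounds = 0  # consecutive rounds in which nothing was discarded
    while quiet_rounds < patience and len(kept) >= n_subsets:
        rng.shuffle(kept)
        # Random partition into n_subsets disjoint, non-empty subsets.
        subsets = [kept[i::n_subsets] for i in range(n_subsets)]
        survivors: List[Sample] = []
        for i, subset in enumerate(subsets):
            # Each subset is judged against the next one, cyclically.
            reference = subsets[(i + 1) % n_subsets]
            survivors.extend(s for s in subset
                             if nearest_label(s[0], reference) == s[1])
        quiet_rounds = quiet_rounds + 1 if len(survivors) == len(kept) else 0
        kept = survivors
    return kept


if __name__ == "__main__":
    # Synthetic two-class data standing in for speech feature vectors.
    rng = random.Random(1)
    data: List[Sample] = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(50)]
    data += [((rng.gauss(6, 1), rng.gauss(6, 1)), 1) for _ in range(50)]
    data += [((0.0, 0.0), 1), ((6.0, 6.0), 0)]  # two deliberately mislabelled points
    cleaned = multi_edit(data)
    print(len(data), "->", len(cleaned))
```

In a pipeline of the kind the abstract describes, the samples returned by `multi_edit` would then be used to fit the ensemble learner (for example, a random forest), and the fitted ensemble would be applied to the held-out test samples.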
Experimental results showed that the proposed algorithm achieved the largest improvement in classification accuracy (29.44%) compared with the other algorithms examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. The proposed algorithm also exhibited higher stability, particularly when the MENN and RF algorithms were combined.
This study showed that the proposed method can improve PD classification based on speech data and can be applied in future studies seeking to improve PD classification methods.