Department of Computer Engineering, Bahcesehir University, Istanbul, Turkey.
J Med Syst. 2010 Aug;34(4):591-9. doi: 10.1007/s10916-009-9272-y. Epub 2009 Mar 14.
Parkinson's disease (PD) is a neurological illness that impairs motor skills, speech, and other functions such as mood, behavior, thinking, and sensation. It causes vocal impairment in approximately 90% of patients. Because the symptoms of PD develop gradually and mostly affect elderly people, for whom physical visits to the clinic are inconvenient and costly, telemonitoring of the disease using measurements of dysphonia (vocal features) has a vital role in its early diagnosis. The dysphonia features that can be extracted from the voice are numerous, and most of them are interrelated. The purpose of this study is twofold: (1) to select a minimal subset of features with maximal joint relevance to the PD-score, a binary score indicating whether or not the sample belongs to a person with PD; and (2) to build a predictive model with minimal bias (i.e., to maximize the generalization of the predictions so that the model performs well on unseen test examples). For these tasks, we apply the mutual information measure with a permutation test to assess the relevance and the statistical significance of the relations between the features and the PD-score, rank the features according to the maximum-relevance-minimum-redundancy (mRMR) criterion, and use a Support Vector Machine (SVM) to build a classification model. We test the model with a cross-validation scheme that we call leave-one-individual-out, which suits the dataset at hand better than conventional bootstrapping or leave-one-out validation.
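The relevance test described above can be illustrated with a minimal sketch. It assumes each feature has already been discretized into bins, and it uses a simple plug-in estimate of mutual information; the abstract does not state which estimator or how many permutations the study used, so the function names, `n_perm`, and `seed` here are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def permutation_p_value(xs, ys, n_perm=1000, seed=0):
    """Return (observed MI, p-value) under the null of no association.

    The labels are shuffled n_perm times; the p-value is the share of
    shuffles whose MI is at least as large as the observed MI.
    """
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(xs, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (not from the study): a binned feature and a binary PD-score.
feature = [0, 0, 1, 1] * 10
pd_score = [0, 0, 1, 1] * 10
mi, p = permutation_p_value(feature, pd_score, n_perm=200)
```

A feature whose p-value falls below a chosen significance level would be kept as a candidate for the subsequent mRMR ranking; the threshold itself is a design choice not given in the abstract.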
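The leave-one-individual-out scheme can be sketched as follows, assuming the dataset holds several voice recordings per person, so that plain leave-one-out would leave other recordings of the test subject in the training set. The function name and the subject IDs below are hypothetical, and the classifier itself is omitted.

```python
def leave_one_individual_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per individual.

    Every recording of the held-out individual goes to the test set, so no
    person contributes samples to both sides of a fold.
    """
    # dict.fromkeys keeps first-seen order while removing repeats.
    for held_out in dict.fromkeys(subject_ids):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


# Toy example: six recordings from three individuals.
subject_ids = ["S01", "S01", "S01", "S02", "S02", "S03"]
folds = list(leave_one_individual_out(subject_ids))
```

In each fold, a classifier such as an SVM would be fit on `train_idx` and scored on `test_idx`. scikit-learn's `LeaveOneGroupOut` splitter produces the same kind of grouped folds when the subject IDs are passed as `groups`.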