Hybrid Feature Selection Framework for the Parkinson Imbalanced Dataset Prediction Problem.

Authors

Qasim Hayder Mohammed, Ata Oguz, Ansari Mohammad Azam, Alomary Mohammad N, Alghamdi Saad, Almehmadi Mazen

Affiliations

Department of Electrical and Computer Engineering, Institute of Science, Altinbas University, Istanbul 34218, Turkey.

Department of Epidemic Disease Research, Institute for Research & Medical Consultations (IRMC), Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia.

Publication information

Medicina (Kaunas). 2021 Nov 8;57(11):1217. doi: 10.3390/medicina57111217.

Abstract

Background and Objectives: Recently, many studies have focused on the early detection of Parkinson's disease (PD). This disease belongs to a group of neurological disorders that directly affect brain cells and influence movement, hearing, and various cognitive functions. Medical datasets are often not equally distributed across their classes, which biases the classification of patients. We propose a hybrid feature selection framework that can deal with imbalanced datasets such as PD data. The framework uses the SMOTE algorithm to handle the class imbalance, and uses Recursive Feature Elimination (RFE) and Principal Component Analysis (PCA) to remove contradictory features from the dataset and decrease the processing time.

Materials and Methods: PD acoustic datasets and the characteristics of control subjects were used to construct classification models such as Bagging, K-nearest neighbour (KNN), multilayer perceptron, and the support vector machine (SVM). In the preprocessing stage, the synthetic minority over-sampling technique (SMOTE) was used together with two feature selection methods, RFE and PCA. The PD dataset has a large difference between the numbers of subjects with and without PD, which causes the classification bias problem; SMOTE was therefore used to resolve it.

Results: For model evaluation, the train-test split technique was used. All the models were tuned with grid search. The SVM model showed the highest accuracy, 98.2%, and the KNN model exhibited the highest specificity, 99%.

Conclusions: The proposed method was compared with current state-of-the-art methods for detecting Parkinson's disease and with methods for other medical conditions. The developed system could handle data bias and achieve high prediction performance for PD, which can help health organizations prioritize their resources properly.
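The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority neighbours. This is not the authors' code. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and a real study would more likely rely on an established library implementation.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up, not from the study): 3 minority vs 6 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
majority = [[3.0, 3.0], [3.1, 2.9], [2.9, 3.2],
            [3.3, 3.1], [2.8, 2.7], [3.0, 3.4]]

new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority), len(majority))
```

Because every synthetic point lies between two real minority samples, the new samples stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that the test split still contains only real samples.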

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e00/8619928/830781d8d9ec/medicina-57-01217-g001.jpg
