A machine learning approach to select features important to stroke prognosis.

Author information

Fang Gang, Liu Wenbin, Wang Lixin

Affiliations

Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China.

Publication information

Comput Biol Chem. 2020 Oct;88:107316. doi: 10.1016/j.compbiolchem.2020.107316. Epub 2020 Jun 23.

Abstract

Ischemic stroke is a common neurological disorder and remains the principal cause of serious long-term disability worldwide. Selecting features related to stroke prognosis is highly valuable for effective intervention and treatment. In this study, an integrated machine learning approach was used to select features as prognostic factors of stroke on the International Stroke Trial (IST) dataset. We considered the common problems of feature selection and prediction in medical datasets. First, features were ranked by importance with the Shapiro-Wilk algorithm, and the Pearson correlations between features were analyzed. Then, to select robust features, we used Recursive Feature Elimination with Cross-Validation (RFECV) with each of linear SVC, Random-Forest-Classifier, Extra-Trees-Classifier, AdaBoost-Classifier, and Multinomial-Naïve-Bayes-Classifier in turn as the estimator. Furthermore, the importance of the selected features was determined by Random-Forest-Classifier and the Shapiro-Wilk algorithm. Finally, the twenty-three selected features were used by SVC, MLP, Random-Forest, and AdaBoost-Classifier to predict RVISINF (infarct visible on CT) in acute stroke on the IST dataset. The results suggest that the selected features can be used to infer the long-term prognosis of acute stroke with high accuracy, and that they can also be used to extract factors related to RVISINF, which is associated with large artery occlusion (LAO) in ischemic stroke patients.
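The RFECV step described in the abstract can be sketched with scikit-learn. This is only an illustration under stated assumptions, not the authors' pipeline: the data are synthetic (generated with `make_classification`, not the IST dataset), only two of the five estimators are shown, the hyperparameters are arbitrary, and treating the intersection of the per-estimator selections as the "robust" set is an assumption of this sketch, since the abstract does not say how the selections were combined.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC

# Synthetic stand-in for a tabular clinical dataset (NOT the IST data).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, n_redundant=2, random_state=0
)

# Two of the estimator families named in the abstract; settings are illustrative.
estimators = {
    "linear_svc": LinearSVC(C=0.1, dual=False, max_iter=5000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Run RFECV once per estimator and record which feature indices survive.
selected = {}
for name, estimator in estimators.items():
    selector = RFECV(estimator=estimator, step=1, cv=cv, scoring="accuracy")
    selector.fit(X, y)
    selected[name] = set(selector.get_support(indices=True))

# Assumption of this sketch: call a feature "robust" if every estimator kept it.
robust = set.intersection(*selected.values())

for name, indices in selected.items():
    print(name, sorted(indices))
print("kept by all estimators:", sorted(robust))
```

RFECV needs an estimator that exposes `coef_` or `feature_importances_` after fitting, which is why a linear SVC and tree ensembles fit this role. The intersection may be small or even empty on some data; a vote count across estimators would be a softer alternative.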
