k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification.

Author information

Xu Lei, Liang Guangmin, Liao Changrui, Chen Gin-Den, Chang Chi-Chang

Affiliations

School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China.

Key Laboratory of Optoelectronic Devices and Systems of Ministry of Education and Guangdong Province, College of Optoelectronic Engineering, Shenzhen University, Shenzhen, China.

Publication information

Front Genet. 2019 Feb 12;10:33. doi: 10.3389/fgene.2019.00033. eCollection 2019.

Abstract

This paper proposes a machine-learning-based computational method for identifying Alzheimer's disease genes. Most existing machine-learning-based methods predict Alzheimer's disease genes using structural magnetic resonance imaging (MRI). These methods have attained acceptable results, but they are expensive and time-consuming. We therefore propose a computational method that identifies Alzheimer's disease genes from protein sequence information and classifies the resulting feature vectors with a random forest. In the proposed method, information about each gene's protein is extracted as adaptive k-skip-n-gram features. Experimental results show that the proposed method attains an accuracy of 85.5% on the selected UniProt dataset.
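To make the feature-extraction idea concrete, the following is a minimal sketch of one common formulation of k-skip-n-gram features for n = 2: it counts every ordered residue pair separated by at most k intervening residues and normalizes the counts into a 400-dimensional vector. The abstract does not define the adaptive variant, so this sketch does not attempt to reproduce it; the function name `k_skip_bigram_features`, the default `k=2`, and the normalization are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import product

# The 20 standard amino-acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_skip_bigram_features(sequence, k=2):
    """Return a normalized 400-dimensional k-skip-bigram vector.

    Hypothetical helper: counts ordered residue pairs that have
    0..k residues skipped between them. This is not the paper's
    adaptive scheme, only a plain k-skip-2-gram illustration.
    """
    counts = Counter()
    for skip in range(k + 1):
        gap = skip + 1  # distance between the two paired positions
        for i in range(len(sequence) - gap):
            counts[sequence[i] + sequence[i + gap]] += 1

    # Fixed ordering of all 20 x 20 residue pairs: "AA", "AC", ..., "YY".
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

    # Pairs containing non-standard residues are ignored.
    total = sum(counts[pair] for pair in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[pair] / total for pair in vocabulary]


features = k_skip_bigram_features("ACAC", k=1)
print(len(features))
```

Vectors of this kind could then be passed to a random forest implementation (for example, scikit-learn's `RandomForestClassifier`) for the classification step; the hyperparameters used in the paper are not stated in the abstract.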


