FLOating-Window Projective Separator (FloWPS): A Data Trimming Tool for Support Vector Machines (SVM) to Improve Robustness of the Classifier.

Author information

Tkachev Victor, Sorokin Maxim, Mescheryakov Artem, Simonov Alexander, Garazha Andrew, Buzdin Anton, Muchnik Ilya, Borisov Nicolas

Affiliations

Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia.

Publication information

Front Genet. 2019 Jan 15;9:717. doi: 10.3389/fgene.2018.00717. eCollection 2018.

Abstract

Here, we propose a heuristic data-trimming technique for SVM, termed FLOating-Window Projective Separator (FloWPS), tailored for personalized predictions based on molecular data. The procedure can operate on high-throughput genetic datasets such as gene expression or mutation profiles. It prevents the SVM from extrapolating by excluding non-informative features. FloWPS requires training on data from individuals with known clinical outcomes to create a clinically relevant classifier. The genetic profiles linked with the outcomes are split, as usual, into training and validation datasets. The unique property of FloWPS is that irrelevant features, those for which a validation point does not have a significant number of neighboring hits in the training dataset, are removed from further analysis. Next, similarly to the k nearest neighbors (kNN) method, for each point of the validation dataset FloWPS takes into account only the proximal points of the training dataset. Thus, for every point of the validation dataset, the training dataset is adjusted to form a floating window. FloWPS performance was tested on ten gene expression datasets covering 992 cancer patients who either responded or did not respond to different types of chemotherapy. Leave-one-out cross-validation confirmed experimentally that FloWPS significantly increases the quality of classifiers built on classical SVM in most of the applications, particularly for polynomial kernels.
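The two trimming steps described in the abstract (dropping features without enough neighboring hits, then keeping only proximal points) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `floating_window`, the parameters `m` and `k`, and the reading of "neighboring hits" as "at least `m` training values on each side of the sample's value" are assumptions made for illustration, as is the use of Euclidean distance.

```python
import math


def floating_window(train_X, test_x, m, k):
    """Trim the training data for a single sample to be classified.

    Assumption (not stated in the abstract): a feature is kept only if
    at least `m` training values lie strictly below AND at least `m`
    lie strictly above the sample's value for that feature.
    Returns (kept feature indices, indices of the k nearest training rows).
    """
    kept = []
    for j in range(len(test_x)):
        below = sum(1 for row in train_X if row[j] < test_x[j])
        above = sum(1 for row in train_X if row[j] > test_x[j])
        if below >= m and above >= m:
            kept.append(j)

    # Euclidean distance computed on the kept features only (assumption).
    def dist(row):
        return math.sqrt(sum((row[j] - test_x[j]) ** 2 for j in kept))

    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))
    return kept, order[:k]


# Toy data: 4 training profiles with 2 features each.
train_X = [[1, 10], [2, 20], [3, 30], [4, 40]]
# Feature 1 of this sample lies far outside the training range,
# so using it would force a classifier to extrapolate.
test_x = [2.5, 100]

kept, neighbours = floating_window(train_X, test_x, m=1, k=2)
```

In this toy case the second feature is dropped because no training value lies above 100, and the two training rows closest on the remaining feature are selected. A classifier such as scikit-learn's `sklearn.svm.SVC` could then be fitted on those rows restricted to the kept columns; how `m` and `k` are chosen is not described in the abstract and is left open here.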

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb85/6341065/8b9ac323d173/fgene-09-00717-g001.jpg
