

FLOating-Window Projective Separator (FloWPS): A Data Trimming Tool for Support Vector Machines (SVM) to Improve Robustness of the Classifier.

Authors

Tkachev Victor, Sorokin Maxim, Mescheryakov Artem, Simonov Alexander, Garazha Andrew, Buzdin Anton, Muchnik Ilya, Borisov Nicolas

Affiliations

Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia.

Publication

Front Genet. 2019 Jan 15;9:717. doi: 10.3389/fgene.2018.00717. eCollection 2018.

Abstract

Here, we propose a heuristic data-trimming technique for SVM, termed FLOating-Window Projective Separator (FloWPS), tailored for personalized predictions based on molecular data. The procedure can operate on high-throughput genetic datasets such as gene expression or mutation profiles. It prevents the SVM from extrapolating by excluding non-informative features. FloWPS requires training on data from individuals with known clinical outcomes to create a clinically relevant classifier. The genetic profiles linked with the outcomes are split, as usual, into training and validation datasets. The unique property of FloWPS is that irrelevant features in the validation dataset, namely those that do not have a significant number of neighboring hits in the training dataset, are removed from further analysis. Next, similarly to the k nearest neighbors (kNN) method, for each point of the validation dataset, FloWPS takes into account only the proximal points of the training dataset. Thus, for every point of the validation dataset, the training dataset is adjusted to form a floating window. FloWPS performance was tested on ten gene expression datasets covering 992 cancer patients who either responded or did not respond to different types of chemotherapy. We confirmed experimentally by leave-one-out cross-validation that FloWPS significantly increases the quality of a classifier built on the classical SVM in most applications, particularly for polynomial kernels.
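The trimming idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `floating_window`, the parameters `m` and `k`, and the exact rule for "neighboring hits" (at least `m` training values on each side of the query value) are assumptions made here for illustration only. The trimmed window would then be passed to an SVM trained separately for each query point.

```python
import math

def floating_window(train_X, train_y, query, m=1, k=3):
    """Trim the training set around a single query (validation) point.

    Assumed rule, not taken from the paper: feature j is kept when at
    least m training values are <= query[j] and at least m are >= query[j],
    so the query is not outside the training range on that feature.
    """
    kept = []
    for j in range(len(query)):
        below = sum(1 for row in train_X if row[j] <= query[j])
        above = sum(1 for row in train_X if row[j] >= query[j])
        if below >= m and above >= m:
            kept.append(j)

    def dist(row):
        # Euclidean distance restricted to the kept features.
        return math.sqrt(sum((row[j] - query[j]) ** 2 for j in kept))

    # Keep only the k training points closest to the query.
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))[:k]
    window_X = [[train_X[i][j] for j in kept] for i in order]
    window_y = [train_y[i] for i in order]
    return kept, window_X, window_y

# Toy data: the second feature of the query lies above every training value,
# so under the assumed rule it is dropped before neighbors are selected.
train_X = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [8.0, 13.0]]
train_y = [0, 0, 1, 1]
kept, window_X, window_y = floating_window(train_X, train_y, [2.5, 20.0], m=1, k=2)
print(kept, window_X, window_y)
```

Because the window is rebuilt for every query point, a separate classifier is fitted per prediction, which fits naturally with the leave-one-out evaluation mentioned in the abstract.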


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb85/6341065/8b9ac323d173/fgene-09-00717-g001.jpg
