Probability-enhanced sufficient dimension reduction for binary classification.

Authors

Shin Seung Jun, Wu Yichao, Zhang Hao Helen, Liu Yufeng

Affiliation

Department of Mathematics, University of Arizona, P.O. Box 210089, Tucson, Arizona 85721-0089, U.S.A.

Publication information

Biometrics. 2014 Sep;70(3):546-55. doi: 10.1111/biom.12174. Epub 2014 Apr 29.

DOI:10.1111/biom.12174
PMID:24779683
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4670268/
Abstract

In high-dimensional data analysis, it is of primary interest to reduce the data dimensionality without loss of information. Sufficient dimension reduction (SDR) arises in this context, and many successful SDR methods have been developed since the introduction of sliced inverse regression (SIR) [Li (1991) Journal of the American Statistical Association 86, 316-327]. Despite this rapid progress, most existing methods target regression problems with a continuous response. For binary classification problems, SIR suffers from the limitation of estimating at most one direction, since only two slices are available. In this article, we develop a new and flexible probability-enhanced SDR method for binary classification problems by using the weighted support vector machine (WSVM). The key idea is to slice the data based on the conditional class probabilities of the observations rather than their binary responses. We first show that the central subspace based on the conditional class probability is the same as that based on the binary response. This important result justifies the proposed slicing scheme from a theoretical perspective and ensures no loss of information. In practice, the true conditional class probability is generally not available, and probability estimation can be challenging for data with high-dimensional inputs. We observe that implementing the new slicing scheme does not require exact probability values; the only information required is the relative order of the probability values. Motivated by this fact, our new SDR procedure bypasses the probability estimation step and employs the WSVM to directly estimate the order of the probability values, based on which the slicing is performed. The performance of the proposed probability-enhanced SDR scheme is evaluated on both simulated and real data examples.
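The observation that slicing needs only the relative order of the probability values can be illustrated with a small sketch. This is not the authors' WSVM procedure: the function name `slice_by_rank` and the probability values are hypothetical, and the scores are simply assumed to be given, whereas the paper obtains their order from the WSVM. The sketch only shows that any strictly increasing transformation of the probabilities produces the same slices, and that more than two slices become available once a continuous score replaces the binary label.

```python
import math


def slice_by_rank(scores, n_slices):
    """Assign observations to n_slices nearly equal-sized slices.

    Only the rank order of the scores is used, never their values.
    (Hypothetical helper, written for illustration.)
    """
    n = len(scores)
    # Indices of the observations, lowest score first.
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * n_slices // n
    return labels


# Hypothetical conditional class probabilities P(Y = 1 | x) for eight observations.
probs = [0.91, 0.08, 0.55, 0.30, 0.72, 0.15, 0.64, 0.43]

# The logit is strictly increasing, so it changes the values but not their order.
logits = [math.log(p / (1 - p)) for p in probs]

slices_from_probs = slice_by_rank(probs, n_slices=4)
slices_from_logits = slice_by_rank(logits, n_slices=4)

print(slices_from_probs)                        # [3, 0, 2, 1, 3, 0, 2, 1]
print(slices_from_probs == slices_from_logits)  # True
```

Because the two slice assignments coincide, any score that preserves the ordering of the conditional class probabilities is enough for the slicing step. Slicing on the binary response itself would yield only two slices, which is the limitation of SIR described in the abstract.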


Similar articles

1
Probability-enhanced sufficient dimension reduction for binary classification.
Biometrics. 2014 Sep;70(3):546-55. doi: 10.1111/biom.12174. Epub 2014 Apr 29.
2
A quantile-slicing approach for sufficient dimension reduction with censored responses.
Biom J. 2021 Jan;63(1):201-212. doi: 10.1002/bimj.201900250. Epub 2020 Sep 9.
3
Principal weighted support vector machines for sufficient dimension reduction in binary classification.
Biometrika. 2017 Mar;104(1):67-81. doi: 10.1093/biomet/asw057. Epub 2017 Jan 19.
4
Two-Dimensional Solution Surface for Weighted Support Vector Machines.
J Comput Graph Stat. 2014 Apr 3;23(2):383-402. doi: 10.1080/10618600.2012.761139.
5
Sufficient dimension reduction for censored regressions.
Biometrics. 2011 Jun;67(2):513-23. doi: 10.1111/j.1541-0420.2010.01490.x. Epub 2010 Sep 28.
6
Adjustment for missingness using auxiliary information in semiparametric regression.
Biometrics. 2010 Mar;66(1):115-22. doi: 10.1111/j.1541-0420.2009.01231.x. Epub 2009 May 7.
7
Sliced inverse regression with regularizations.
Biometrics. 2008 Mar;64(1):124-31. doi: 10.1111/j.1541-0420.2007.00836.x. Epub 2007 Jul 25.
8
Sufficient dimension reduction via random-partitions for the large-p-small-n problem.
Biometrics. 2019 Mar;75(1):245-255. doi: 10.1111/biom.12926. Epub 2018 Jul 27.
9
Sufficient dimension reduction for censored predictors.
Biometrics. 2017 Mar;73(1):220-231. doi: 10.1111/biom.12556. Epub 2016 Aug 9.
10
Determining the number of clusters using the weighted gap statistic.
Biometrics. 2007 Dec;63(4):1031-7. doi: 10.1111/j.1541-0420.2007.00784.x. Epub 2007 Apr 9.

Cited by

1
Sufficient dimension reduction for classification using principal optimal transport direction.
Adv Neural Inf Process Syst. 2020 Dec;33:4015-4028.
2
Receiver operating characteristic curves and confidence bands for support vector machines.
Biometrics. 2021 Dec;77(4):1422-1430. doi: 10.1111/biom.13365. Epub 2020 Sep 12.
3
Principal weighted support vector machines for sufficient dimension reduction in binary classification.
Biometrika. 2017 Mar;104(1):67-81. doi: 10.1093/biomet/asw057. Epub 2017 Jan 19.
4
Comments on: Probability Enhanced Effective Dimension Reduction for Classifying Sparse Functional Data.
Test (Madr). 2016 Mar;25(1):44-46. doi: 10.1007/s11749-015-0474-y. Epub 2016 Jan 25.

References

1
Two-Dimensional Solution Surface for Weighted Support Vector Machines.
J Comput Graph Stat. 2014 Apr 3;23(2):383-402. doi: 10.1080/10618600.2012.761139.
2
Asymptotic properties of sufficient dimension reduction with a diverging number of predictors.
Stat Sin. 2011;2011(21):707-730. doi: 10.5705/ss.2011.031a.
3
PLS dimension reduction for classification with microarray data.
Stat Appl Genet Mol Biol. 2004;3:Article33. doi: 10.2202/1544-6115.1075. Epub 2004 Nov 23.
4
Using the Fisher kernel method to detect remote protein homologies.
Proc Int Conf Intell Syst Mol Biol. 1999:149-58.