Classification of high-dimensional data with ensemble of logistic regression models.

Author information

Lim Noha, Ahn Hongshik, Moon Hojin, Chen James J

Affiliation

Immune Tolerance Network, University of California-San Francisco, San Francisco, California, USA.

Publication information

J Biopharm Stat. 2010 Jan;20(1):160-71. doi: 10.1080/10543400903280639.

Abstract

A classification method is developed based on ensembles of logistic regression models, with each model fitted from a different set of predictors determined by a random partition of the feature space. The proposed method enables class prediction by an ensemble of logistic regression models for a high-dimensional data set, which is impossible by a single logistic regression model due to the restriction that the sample size needs to be larger than the number of predictors. The proposed classification method is applied to gene expression data on pediatric acute myeloid leukemia (AML) patients to predict each patient's risk for treatment failure or relapse at the time of diagnosis. Hence, specific prognostic biomarkers can be used to predict outcomes in pediatric AML and formulate individual risk-adjusted treatment. Our study shows that the proposed method is comparable to other widely used models in generalized accuracy and is significantly improved in balance between sensitivity and specificity. The proposed ensemble algorithm enables the standard classification model to be used for classification of high-dimensional data.
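The core idea in the abstract, fitting each logistic regression on a different block of a random partition of the features and combining the models, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the gradient-descent fitting, the averaging of predicted probabilities, the 0.5 threshold, and the toy data are all assumptions made for illustration; the abstract does not specify how the individual models are fitted or combined.

```python
import math
import random


def random_partition(n_features, n_groups, rng):
    """Shuffle feature indices and split them into n_groups disjoint subsets."""
    idx = list(range(n_features))
    rng.shuffle(idx)
    return [idx[g::n_groups] for g in range(n_groups)]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, cols, lr=0.1, epochs=300):
    """Fit one logistic regression on the columns in `cols` by batch
    gradient descent (an assumed fitting method). Returns (intercept, weights)."""
    n = len(X)
    b, w = 0.0, [0.0] * len(cols)
    for _ in range(epochs):
        gb, gw = 0.0, [0.0] * len(cols)
        for row, target in zip(X, y):
            p = sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols)))
            err = p - target
            gb += err
            for j, c in enumerate(cols):
                gw[j] += err * row[c]
        b -= lr * gb / n
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
    return b, w


def fit_ensemble(X, y, n_groups, seed=0):
    """Fit one model per block of a random partition of the feature space."""
    rng = random.Random(seed)
    groups = random_partition(len(X[0]), n_groups, rng)
    return [(cols, fit_logistic(X, y, cols)) for cols in groups]


def predict_proba(ensemble, row):
    # Assumed combination rule: average the per-model probabilities.
    probs = [sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols)))
             for cols, (b, w) in ensemble]
    return sum(probs) / len(probs)


def predict(ensemble, row, threshold=0.5):
    # The 0.5 threshold is an assumption; it could be tuned to balance
    # sensitivity and specificity.
    return int(predict_proba(ensemble, row) >= threshold)


# Synthetic toy data: 12 samples, 40 features (more predictors than samples).
data_rng = random.Random(1)
y = [i % 2 for i in range(12)]
X = [[data_rng.gauss(0.0, 1.0) + (1.5 * label if j < 5 else 0.0)
      for j in range(40)] for label in y]

# 10 groups of 4 features each, so every model has fewer predictors than samples.
ensemble = fit_ensemble(X, y, n_groups=10)
print(predict_proba(ensemble, X[0]), predict(ensemble, X[0]))
```

With 40 features and 12 samples, a single unpenalized logistic regression would be over-parameterized. Splitting the features into 10 blocks of 4 keeps each model's predictor count below the sample size, which is the restriction the abstract refers to.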
