Predicting patient survival from microarray data by accelerated failure time modeling using partial least squares and LASSO.

Authors

Datta Susmita, Le-Rademacher Jennifer, Datta Somnath

Affiliation

Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, Kentucky 40202, USA.

Publication

Biometrics. 2007 Mar;63(1):259-71. doi: 10.1111/j.1541-0420.2006.00660.x.

DOI:10.1111/j.1541-0420.2006.00660.x
PMID:17447952
Abstract

We consider the problem of predicting survival times of cancer patients from the gene expression profiles of their tumor samples via linear regression modeling of log-transformed failure times. The partial least squares (PLS) and least absolute shrinkage and selection operator (LASSO) methodologies are used for this purpose, where we first modify the data to account for censoring. Three approaches to handling right-censored data (reweighting, mean imputation, and multiple imputation) are considered. Their performances are examined in a detailed simulation study and compared with that of full data PLS and LASSO had there been no censoring. A major objective of this article is to investigate the performances of PLS and LASSO in the context of microarray data, where the number of covariates is very large and there are extremely few samples. We demonstrate that LASSO outperforms PLS in terms of prediction error when the list of covariates includes a moderate to large percentage of useless or noise variables; otherwise, PLS may outperform LASSO. For a moderate sample size (100 with 10,000 covariates), LASSO performed better than a no covariate model (or noise-based prediction). The mean imputation method appears to best track the performance of the full data PLS or LASSO. The mean imputation scheme is used on an existing data set on lung cancer. This reanalysis using the mean imputed PLS and LASSO identifies a number of genes that were known to be related to cancer or tumor activities from previous studies.
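
The mean imputation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the conditional mean of a censored log failure time is estimated from a Kaplan-Meier curve of the observed log times, which the abstract does not spell out. The function names `km_jumps` and `mean_impute` are hypothetical, and treating the largest observation as a failure (so that the curve drops to zero) is an assumption of this sketch, not a detail confirmed by the source.

```python
def km_jumps(y, delta):
    """Return (time, probability mass) pairs of a Kaplan-Meier estimate.

    y     : observed log times (failure or censoring, whichever came first)
    delta : 1 if the failure was observed, 0 if right censored
    Assumption: the largest observation is treated as a failure so that
    the estimated survival curve reaches zero.
    """
    n = len(y)
    # Sort by time; at tied times, failures are processed before censorings.
    order = sorted(range(n), key=lambda i: (y[i], -delta[i]))
    surv = 1.0
    jumps = []
    for rank, i in enumerate(order):
        at_risk = n - rank
        if delta[i] == 1 or rank == n - 1:
            jump = surv / at_risk  # mass the curve drops at this time
            jumps.append((y[i], jump))
            surv -= jump
    return jumps


def mean_impute(y, delta):
    """Replace each censored y_i by an estimate of E[Y | Y > y_i]."""
    jumps = km_jumps(y, delta)
    imputed = []
    for yi, di in zip(y, delta):
        if di == 1:
            imputed.append(yi)  # observed failures are left unchanged
            continue
        tail = [(t, w) for t, w in jumps if t > yi]
        mass = sum(w for _, w in tail)
        if mass <= 0.0:
            imputed.append(yi)  # no estimated mass beyond this point
        else:
            imputed.append(sum(t * w for t, w in tail) / mass)
    return imputed


# Toy data on the log scale: the second patient is right censored.
log_times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
print(mean_impute(log_times, events))
```

The imputed responses could then be passed to any standard PLS or LASSO regression on the gene expression covariates, since after imputation the problem has the form of an ordinary linear regression on log failure times.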

Similar articles

1
Predicting patient survival from microarray data by accelerated failure time modeling using partial least squares and LASSO.
Biometrics. 2007 Mar;63(1):259-71. doi: 10.1111/j.1541-0420.2006.00660.x.
2
Partial least squares dimension reduction for microarray gene expression data with a censored response.
Math Biosci. 2005 Jan;193(1):119-37. doi: 10.1016/j.mbs.2004.10.007. Epub 2005 Jan 22.
3
Predicting survival from microarray data--a comparative study.
Bioinformatics. 2007 Aug 15;23(16):2080-7. doi: 10.1093/bioinformatics/btm305. Epub 2007 Jun 6.
4
Effects of nonlinearities and uncorrelated or correlated errors in realistic simulated data on the prediction abilities of augmented classical least squares and partial least squares.
Appl Spectrosc. 2004 Sep;58(9):1065-73. doi: 10.1366/0003702041959334.
5
High-dimensional Cox models: the choice of penalty as part of the model building process.
Biom J. 2010 Feb;52(1):50-69. doi: 10.1002/bimj.200900064.
6
Regularized estimation in the accelerated failure time model with high-dimensional covariates.
Biometrics. 2006 Sep;62(3):813-20. doi: 10.1111/j.1541-0420.2006.00562.x.
7
Robust imputation method for missing values in microarray data.
BMC Bioinformatics. 2007 May 3;8 Suppl 2(Suppl 2):S6. doi: 10.1186/1471-2105-8-S2-S6.
8
Dimension reduction methods for microarrays with application to censored survival data.
Bioinformatics. 2004 Dec 12;20(18):3406-12. doi: 10.1093/bioinformatics/bth415. Epub 2004 Jul 15.
9
Cox survival analysis of microarray gene expression data using correlation principal component regression.
Stat Appl Genet Mol Biol. 2007;6:Article16. doi: 10.2202/1544-6115.1153. Epub 2007 May 29.
10
Borrowing information from relevant microarray studies for sample classification using weighted partial least squares.
Comput Biol Chem. 2005 Jun;29(3):204-11. doi: 10.1016/j.compbiolchem.2005.04.002.

Cited by

1
asmbPLS: biomarker identification and patient survival prediction with multi-omics data.
Front Genet. 2024 Nov 22;15:1444054. doi: 10.3389/fgene.2024.1444054. eCollection 2024.
2
EFFICIENT ESTIMATION OF THE MAXIMAL ASSOCIATION BETWEEN MULTIPLE PREDICTORS AND A SURVIVAL OUTCOME.
Ann Stat. 2023 Oct;51(5):1965-1988. doi: 10.1214/23-aos2313. Epub 2023 Dec 14.
3
Estimation of Norm Penalized Models: A Statistical Treatment.
Comput Stat Data Anal. 2024 Apr;192. doi: 10.1016/j.csda.2023.107902. Epub 2023 Dec 6.
4
Regularized Buckley-James method for right-censored outcomes with block-missing multimodal covariates.
Stat (Int Stat Inst). 2022 Dec;11(1). doi: 10.1002/sta4.515. Epub 2022 Oct 13.
5
Development of an implantable collamer lens sizing model: a retrospective study using ANTERION swept-source optical coherence tomography and a literature review.
BMC Ophthalmol. 2023 Feb 10;23(1):59. doi: 10.1186/s12886-023-02814-7.
6
Radiomics features of DSC-PWI in time dimension may provide a new chance to identify ischemic stroke.
Front Neurol. 2022 Nov 4;13:889090. doi: 10.3389/fneur.2022.889090. eCollection 2022.
7
Predicting reoperation after operative treatment of proximal humerus fractures.
Eur J Orthop Surg Traumatol. 2021 Aug;31(6):1105-1112. doi: 10.1007/s00590-020-02841-w. Epub 2021 Jan 4.
8
lncRNAs classifier to accurately predict the recurrence of thymic epithelial tumors.
Thorac Cancer. 2020 Jul;11(7):1773-1783. doi: 10.1111/1759-7714.13439. Epub 2020 May 6.
9
Prognostic value of the expression of chemokines and their receptors in regional lymph nodes of melanoma patients.
J Cell Mol Med. 2020 Mar;24(6):3407-3418. doi: 10.1111/jcmm.15015. Epub 2020 Jan 26.
10
Marginal screening for high-dimensional predictors of survival outcomes.
Stat Sin. 2019 Oct;29(4):2105-2139. doi: 10.5705/ss.202017.0298.