Testing the additional predictive value of high-dimensional molecular data.

Affiliation

Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Marchioninistr 15, D-81377 Munich, Germany.

Publication information

BMC Bioinformatics. 2010 Feb 8;11:78. doi: 10.1186/1471-2105-11-78.

DOI:10.1186/1471-2105-11-78
PMID:20144191
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2837029/
Abstract

BACKGROUND

While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are already available has long been under-considered in the bioinformatics literature.

RESULTS

We suggest an intuitive permutation-based testing procedure for assessing the additional predictive value of high-dimensional molecular data. Our method combines two well-known statistical tools: logistic regression and boosting regression. We give clear advice for the choice of the only method parameter (the number of boosting iterations). In simulations, our novel approach is found to have very good power in different settings, e.g. few strong predictors or many weak predictors. For illustrative purposes, it is applied to two publicly available cancer data sets.
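The procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation (that is the R package "globalboosttest"). It only assumes a plausible reading of the abstract: a logistic regression on the clinical covariates supplies an offset, componentwise gradient boosting on the molecular data is run for a fixed number of iterations, and the resulting loss is compared with the losses obtained after permuting the molecular data. All function names, the step size `nu`, the default `mstop`, and the "+1" permutation p-value correction are assumptions made for illustration.

```python
import numpy as np


def fit_logistic(Z, y, n_iter=25):
    """Newton-Raphson logistic regression of y on Z; returns the linear predictor."""
    A = np.column_stack([np.ones(len(y)), Z])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        W = p * (1.0 - p)
        # Tiny ridge term keeps the Hessian invertible (illustrative safeguard).
        H = A.T @ (A * W[:, None]) + 1e-8 * np.eye(A.shape[1])
        beta += np.linalg.solve(H, A.T @ (y - p))
    return A @ beta


def boosted_loss(X, y, offset, mstop=100, nu=0.1):
    """Componentwise boosting for the logistic loss, started at `offset`.

    Returns the mean negative log-likelihood after `mstop` iterations.
    """
    Xc = X - X.mean(axis=0)
    ss = (Xc ** 2).sum(axis=0)
    ss[ss == 0] = 1.0  # constant columns get coefficient 0 and are never useful
    f = offset.copy()
    for _ in range(mstop):
        resid = y - 1.0 / (1.0 + np.exp(-f))   # negative gradient of the loss
        coef = Xc.T @ resid / ss               # least-squares fit per column
        j = np.argmax(coef ** 2 * ss)          # column that best fits the gradient
        f += nu * coef[j] * Xc[:, j]
    return np.mean(np.logaddexp(0.0, f) - y * f)


def global_boost_test(X, Z, y, mstop=100, n_perm=200, seed=0):
    """Permutation test for the added predictive value of X given Z."""
    rng = np.random.default_rng(seed)
    offset = fit_logistic(Z, y)                # clinical model, kept fixed
    observed = boosted_loss(X, y, offset, mstop)
    null = np.array([
        boosted_loss(X[rng.permutation(len(y))], y, offset, mstop)
        for _ in range(n_perm)
    ])
    # A small p-value means the real X lowers the loss more than permuted X.
    p_value = (1 + np.sum(null <= observed)) / (1 + n_perm)
    return observed, p_value
```

Calling `global_boost_test(X, Z, y)` with an n-by-p molecular matrix `X`, an n-by-q clinical matrix `Z`, and a 0/1 outcome vector `y` returns the observed loss and a permutation p-value. Permuting the rows of `X` leaves the fitted clinical offset untouched, so the null distribution reflects only what boosting can extract from molecular data unrelated to the outcome.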

CONCLUSIONS

Our simple and computationally efficient approach can be used to globally assess the additional predictive power of a large number of candidate predictors given that a few clinical covariates or a known prognostic index are already available. It is implemented in the R package "globalboosttest", which is publicly available from R-forge and will be submitted to CRAN as soon as possible.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb18/2837029/b6bfb899b6ce/1471-2105-11-78-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb18/2837029/240cb14dedb3/1471-2105-11-78-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb18/2837029/3ea614ab73e7/1471-2105-11-78-2.jpg
