Nonparametric pathway-based regression models for analysis of genomic data.

Authors

Wei Zhi, Li Hongzhe

Affiliation

Genomics and Computational Biology Graduate Group, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA.

Publication information

Biostatistics. 2007 Apr;8(2):265-84. doi: 10.1093/biostatistics/kxl007. Epub 2006 Jun 13.

DOI:10.1093/biostatistics/kxl007
PMID:16772399
Abstract

High-throughput genomic data provide an opportunity for identifying pathways and genes that are related to various clinical phenotypes. Besides these genomic data, another valuable source of data is the biological knowledge about genes and pathways that might be related to the phenotypes of many complex diseases. Databases of such knowledge are often called metadata. In microarray data analysis, such metadata are currently explored in post hoc ways by gene set enrichment analysis but have hardly been utilized in the modeling step. We propose to develop and evaluate a pathway-based gradient descent boosting procedure for nonparametric pathway-based regression (NPR) analysis to efficiently integrate genomic data and metadata. Such NPR models consider multiple pathways simultaneously, allow complex interactions among genes within the pathways, and can be applied to identify pathways and genes that are related to variations of the phenotypes. These methods also offer a way to mitigate the problem of a large number of potential interactions by limiting the analysis to biologically plausible interactions between genes in related pathways. Our simulation studies indicate that the proposed boosting procedure can indeed identify relevant pathways. Application to a gene expression data set on breast cancer distant metastasis indicated that the Wnt, apoptosis, and cell cycle-regulated pathways are more likely to be related to the risk of distant metastasis among lymph-node-negative breast cancer patients. Results from the analysis of two other breast cancer gene expression data sets indicate that the pathways of metalloendopeptidases (MMPs) and MMP inhibitors, as well as cell proliferation, cell growth, and maintenance, are important to breast cancer relapse and survival. We also observed that incorporating the pathway information yielded better prediction of cancer recurrence.
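The pathway-based boosting idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes squared-error loss, small regression trees as base learners, and a simulated data set, and the names `fit_tree`, `predict_tree`, `pathway_boost`, and the pathway labels `A`, `B`, `C` are hypothetical. At each iteration one tree per pathway is fitted to the current residuals using only that pathway's genes, and the best-fitting pathway is added to the model with shrinkage.

```python
import random

def fit_tree(rows, resid, genes, depth):
    """Fit a small regression tree to residuals, splitting only on `genes`."""
    mean = sum(resid) / len(resid)
    if depth == 0 or len(rows) < 4:
        return mean  # leaf: predict the mean residual
    best = None
    for g in genes:
        # Skipping the smallest value keeps both sides of every split non-empty.
        for t in sorted({x[g] for x in rows})[1:]:
            left = [r for x, r in zip(rows, resid) if x[g] < t]
            right = [r for x, r in zip(rows, resid) if x[g] >= t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, g, t)
    if best is None:
        return mean  # no usable split (all genes constant)
    _, g, t = best
    li = [i for i, x in enumerate(rows) if x[g] < t]
    ri = [i for i, x in enumerate(rows) if x[g] >= t]
    return (g, t,
            fit_tree([rows[i] for i in li], [resid[i] for i in li], genes, depth - 1),
            fit_tree([rows[i] for i in ri], [resid[i] for i in ri], genes, depth - 1))

def predict_tree(tree, x):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        g, t, left, right = tree
        tree = left if x[g] < t else right
    return tree

def pathway_boost(rows, y, pathways, n_iter=20, nu=0.1, depth=2):
    """Boosting in which each step selects the single best-fitting pathway."""
    f0 = sum(y) / len(y)
    pred = [f0] * len(y)
    counts = {name: 0 for name in pathways}
    model = []
    for _ in range(n_iter):
        # Negative gradient of squared-error loss is the residual.
        resid = [yi - pi for yi, pi in zip(y, pred)]
        best = None
        for name, genes in pathways.items():
            tree = fit_tree(rows, resid, genes, depth)
            sse = sum((r - predict_tree(tree, x)) ** 2 for x, r in zip(rows, resid))
            if best is None or sse < best[0]:
                best = (sse, name, tree)
        _, name, tree = best
        model.append((name, tree))
        counts[name] += 1
        # Shrunken update using only the selected pathway's tree.
        pred = [p + nu * predict_tree(tree, x) for p, x in zip(pred, rows)]
    return f0, model, counts, pred

# Simulated example (an assumption, not data from the paper): the outcome
# depends on an interaction between genes 0 and 1, both in pathway "A".
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [(2.0 if x[0] > 0 and x[1] > 0 else 0.0) + random.gauss(0, 0.1) for x in rows]
pathways = {"A": [0, 1], "B": [2, 3], "C": [4, 5]}
f0, model, counts, pred = pathway_boost(rows, y, pathways)
initial_mse = sum((yi - f0) ** 2 for yi in y) / len(y)
final_mse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)
print(counts, initial_mse, final_mse)
```

Counting how often each pathway is selected gives a rough importance measure, and because each tree sees only the genes of one pathway, interactions are restricted to genes within the same pathway, which mirrors the restriction to biologically plausible interactions described above.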

Similar articles

1
Nonparametric pathway-based regression models for analysis of genomic data.
Biostatistics. 2007 Apr;8(2):265-84. doi: 10.1093/biostatistics/kxl007. Epub 2006 Jun 13.
2
Group additive regression models for genomic data analysis.
Biostatistics. 2008 Jan;9(1):100-13. doi: 10.1093/biostatistics/kxm015. Epub 2007 May 18.
3
A Markov random field model for network-based analysis of genomic data.
Bioinformatics. 2007 Jun 15;23(12):1537-44. doi: 10.1093/bioinformatics/btm129. Epub 2007 May 5.
4
Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data.
Bioinformatics. 2005 Jul 1;21(13):3001-8. doi: 10.1093/bioinformatics/bti422. Epub 2005 Apr 6.
5
Pathway analysis using random forests classification and regression.
Bioinformatics. 2006 Aug 15;22(16):2028-36. doi: 10.1093/bioinformatics/btl344. Epub 2006 Jun 29.
6
[Prognostic molecular classification of breast cancers based on gene expression profiling].
Zhonghua Zhong Liu Za Zhi. 2006 Dec;28(12):900-6.
7
Using support vector regression to model the correlation between the clinical metastases time and gene expression profile for breast cancer.
Artif Intell Med. 2008 Nov;44(3):221-31. doi: 10.1016/j.artmed.2008.06.005. Epub 2008 Aug 3.
8
[Identification of the differentially expressed genes between primary breast cancer and paired lymph node metastasis through combining mRNA differential display and gene microarray].
Zhonghua Yi Xue Za Zhi. 2006 Oct 24;86(39):2749-55.
9
Pathway analysis of gene signatures predicting metastasis of node-negative primary breast cancer.
BMC Cancer. 2007 Sep 25;7:182. doi: 10.1186/1471-2407-7-182.
10
Copy number alterations that predict metastatic capability of human breast cancer.
Cancer Res. 2009 May 1;69(9):3795-801. doi: 10.1158/0008-5472.CAN-08-4596. Epub 2009 Mar 31.

Cited by

1
Weighted overlapping group lasso for integrating prior network knowledge into gene set analysis.
BMC Bioinformatics. 2025 Sep 1;26(1):226. doi: 10.1186/s12859-025-06170-9.
2
A novel non-negative Bayesian stacking modeling method for Cancer survival prediction using high-dimensional omics data.
BMC Med Res Methodol. 2024 May 3;24(1):105. doi: 10.1186/s12874-024-02232-3.
3
A non-negative spike-and-slab lasso generalized linear stacking prediction modeling method for high-dimensional omics data.
BMC Bioinformatics. 2024 Mar 20;25(1):119. doi: 10.1186/s12859-024-05741-6.
4
A statistical boosting framework for polygenic risk scores based on large-scale genotype data.
Front Genet. 2023 Jan 10;13:1076440. doi: 10.3389/fgene.2022.1076440. eCollection 2022.
5
MCC-SP: a powerful integration method for identification of causal pathways from genetic variants to complex disease.
BMC Genet. 2020 Aug 26;21(1):90. doi: 10.1186/s12863-020-00899-3.
6
Statistics in the Genomic Era.
Genes (Basel). 2020 Apr 18;11(4):443. doi: 10.3390/genes11040443.
7
Incorporating biological structure into machine learning models in biomedicine.
Curr Opin Biotechnol. 2020 Jun;63:126-134. doi: 10.1016/j.copbio.2019.12.021. Epub 2020 Jan 18.
8
A Pathway-Based Kernel Boosting Method for Sample Classification Using Genomic Data.
Genes (Basel). 2019 Aug 31;10(9):670. doi: 10.3390/genes10090670.
9
Pathway-Structured Predictive Model for Cancer Survival Prediction: A Two-Stage Approach.
Genetics. 2017 Jan;205(1):89-100. doi: 10.1534/genetics.116.189191. Epub 2016 Nov 9.
10
Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection.
Cancer Inform. 2016 Sep 15;15:179-87. doi: 10.4137/CIN.S40043. eCollection 2016.