Group additive regression models for genomic data analysis.

Author information

Luan Yihui, Li Hongzhe

Affiliation

Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA 19104-6021, USA.

Publication information

Biostatistics. 2008 Jan;9(1):100-13. doi: 10.1093/biostatistics/kxm015. Epub 2007 May 18.

DOI:10.1093/biostatistics/kxm015
PMID:17513311
Abstract

One important problem in genomic research is to identify genomic features, such as gene expression levels or DNA single nucleotide polymorphisms (SNPs), that are related to clinical phenotypes. Often these genomic data can be naturally divided into biologically meaningful groups, such as genes belonging to the same pathway or SNPs within the same gene. In this paper, we propose group additive regression models and a group gradient descent boosting procedure for identifying groups of genomic features that are related to clinical phenotypes. Our simulation results show that dividing the variables into appropriate groups gives better identification of the group features that are related to the phenotypes. In addition, the prediction mean squared errors are smaller than those of the component-wise boosting procedure. We demonstrate the application of the methods to pathway-based analysis of microarray gene expression data of breast cancer. Results from the analysis of a breast cancer microarray gene expression data set indicate that the pathways of metalloendopeptidases (MMPs) and MMP inhibitors, as well as cell proliferation, cell growth, and maintenance, are important to breast cancer-specific survival.
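To make the idea of boosting over groups concrete, here is a minimal sketch in Python. It is not the authors' procedure: the abstract does not specify the base learner, the loss, or the stopping rule, so this sketch assumes squared-error loss, a plain least-squares fit within each group as the base learner, a fixed number of steps, and a fixed step size. The names `group_l2_boost`, `n_steps`, and `nu` are hypothetical, and the data are simulated.

```python
import numpy as np


def group_l2_boost(X, y, groups, n_steps=30, nu=0.1):
    """Illustrative group-wise L2 boosting (an assumption-laden sketch).

    X: (n, p) feature matrix; y: (n,) response;
    groups: list of column-index lists, one list per group.
    Returns the intercept, the coefficient vector, and how often
    each group was selected.
    """
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # centre the columns
    y_mean = float(np.mean(y))
    resid = np.asarray(y, dtype=float) - y_mean
    beta = np.zeros(p)
    counts = np.zeros(len(groups), dtype=int)

    for _ in range(n_steps):
        best_rss, best_g, best_coef = np.inf, None, None
        # Fit the current residuals with each group in turn and keep
        # the group that reduces the residual sum of squares the most.
        for g, idx in enumerate(groups):
            Xg = Xc[:, idx]
            coef, *_ = np.linalg.lstsq(Xg, resid, rcond=None)
            rss = float(np.sum((resid - Xg @ coef) ** 2))
            if rss < best_rss:
                best_rss, best_g, best_coef = rss, g, coef
        # Take a small step (shrinkage nu) along the winning group's fit.
        idx = groups[best_g]
        beta[idx] += nu * best_coef
        resid = resid - nu * (Xc[:, idx] @ best_coef)
        counts[best_g] += 1

    intercept = y_mean - x_mean @ beta   # undo the centring
    return intercept, beta, counts


# Simulated example: only the first of three groups carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
y = (2.0 * X[:, 0] - 1.0 * X[:, 1] + 1.5 * X[:, 2]
     + rng.normal(scale=0.5, size=200))

intercept, beta, counts = group_l2_boost(X, y, groups)
mse = np.mean((y - (intercept + X @ beta)) ** 2)
print(counts, round(float(mse), 3))
```

The selection counts give a rough indication of which groups the procedure considers relevant. Setting every group to a single column would turn the same loop into a component-wise boosting procedure, which is the comparison the abstract refers to.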

Similar articles

1
Group additive regression models for genomic data analysis.
Biostatistics. 2008 Jan;9(1):100-13. doi: 10.1093/biostatistics/kxm015. Epub 2007 May 18.
2
Nonparametric pathway-based regression models for analysis of genomic data.
Biostatistics. 2007 Apr;8(2):265-84. doi: 10.1093/biostatistics/kxl007. Epub 2006 Jun 13.
3
A Markov random field model for network-based analysis of genomic data.
Bioinformatics. 2007 Jun 15;23(12):1537-44. doi: 10.1093/bioinformatics/btm129. Epub 2007 May 5.
4
Use of expression data and the CGEMS genome-wide breast cancer association study to identify genes that may modify risk in BRCA1/2 mutation carriers.
Breast Cancer Res Treat. 2008 Nov;112(2):229-36. doi: 10.1007/s10549-007-9848-5. Epub 2007 Dec 20.
5
Identification of SNP interactions using logic regression.
Biostatistics. 2008 Jan;9(1):187-98. doi: 10.1093/biostatistics/kxm024. Epub 2007 Jun 19.
6
Genomic characterization of multiple clinical phenotypes of cancer using multivariate linear regression models.
Bioinformatics. 2007 Mar 15;23(6):732-8. doi: 10.1093/bioinformatics/btl663. Epub 2007 Jan 18.
7
Genetic polymorphisms of matrix metalloproteinases in lung, breast and colorectal cancer.
Clin Genet. 2008 Mar;73(3):197-211. doi: 10.1111/j.1399-0004.2007.00946.x. Epub 2007 Dec 29.
8
New pathway links from cancer-progression determinants to gene expression of matrix metalloproteinases in breast cancer cells.
J Cell Physiol. 2008 Dec;217(3):739-44. doi: 10.1002/jcp.21548.
9
[Prognostic molecular classification of breast cancers based on gene expression profiling].
Zhonghua Zhong Liu Za Zhi. 2006 Dec;28(12):900-6.
10
Challenges in projecting clustering results across gene expression-profiling datasets.
J Natl Cancer Inst. 2007 Nov 21;99(22):1715-23. doi: 10.1093/jnci/djm216. Epub 2007 Nov 13.

Cited by

1
A statistical boosting framework for polygenic risk scores based on large-scale genotype data.
Front Genet. 2023 Jan 10;13:1076440. doi: 10.3389/fgene.2022.1076440. eCollection 2022.
2
Statistics in the Genomic Era.
Genes (Basel). 2020 Apr 18;11(4):443. doi: 10.3390/genes11040443.
3
A Pathway-Based Kernel Boosting Method for Sample Classification Using Genomic Data.
Genes (Basel). 2019 Aug 31;10(9):670. doi: 10.3390/genes10090670.
4
Pathway aggregation for survival prediction via multiple kernel learning.
Stat Med. 2018 Jul 20;37(16):2501-2515. doi: 10.1002/sim.7681. Epub 2018 Apr 17.
5
IPI59: An Actionable Biomarker to Improve Treatment Response in Serous Ovarian Carcinoma Patients.
Stat Biosci. 2017 Jun;9(1):1-12. doi: 10.1007/s12561-016-9144-1. Epub 2016 Mar 29.
6
A novel procedure for statistical inference and verification of gene regulatory subnetwork.
BMC Bioinformatics. 2015;16 Suppl 7(Suppl 7):S7. doi: 10.1186/1471-2105-16-S7-S7. Epub 2015 Apr 23.
7
Pathway-gene identification for pancreatic cancer survival via doubly regularized Cox regression.
BMC Syst Biol. 2014;8 Suppl 1(Suppl 1):S3. doi: 10.1186/1752-0509-8-S1-S3. Epub 2014 Jan 24.
8
SBERIA: set-based gene-environment interaction test for rare and common variants in complex diseases.
Genet Epidemiol. 2013 Jul;37(5):452-64. doi: 10.1002/gepi.21735. Epub 2013 May 29.
9
Pathway index models for construction of patient-specific risk profiles.
Stat Med. 2013 Apr 30;32(9):1524-35. doi: 10.1002/sim.5641. Epub 2012 Oct 16.
10
Bayesian gene set analysis for identifying significant biological pathways.
J R Stat Soc Ser C Appl Stat. 2011 Aug 1;60(4):541-557. doi: 10.1111/j.1467-9876.2011.00765.x.