Averaged gene expressions for regression.

Author information

Park Mee Young, Hastie Trevor, Tibshirani Robert

Affiliation

Google Inc, Mountain View, CA 94043, USA.

Publication information

Biostatistics. 2007 Apr;8(2):212-27. doi: 10.1093/biostatistics/kxl002. Epub 2006 May 11.

DOI:10.1093/biostatistics/kxl002

PMID:16698769

Abstract

Although averaging is a simple technique, it plays an important role in reducing variance. We use this essential property of averaging in regression of the DNA microarray data, which poses the challenge of having far more features than samples. In this paper, we introduce a two-step procedure that combines (1) hierarchical clustering and (2) Lasso. By averaging the genes within the clusters obtained from hierarchical clustering, we define supergenes and use them to fit regression models, thereby attaining concise interpretation and accuracy. Our methods are supported with theoretical justifications and demonstrated on simulated and real data sets.
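The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the linkage method, the distance metric, or how the number of clusters and the Lasso penalty are chosen, so average linkage on correlation distance, the helper name `supergene_lasso`, and the fixed values of `n_clusters` and `alpha` are all assumptions. The simulated data are also hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.linear_model import Lasso


def supergene_lasso(X, y, n_clusters, alpha):
    """Cluster genes (columns of X), average within clusters, fit a Lasso.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Step 1: hierarchical clustering of genes. Each gene is a column of X,
    # so the transpose is passed. Linkage and metric are assumptions.
    Z = linkage(X.T, method="average", metric="correlation")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")

    # Supergenes: mean expression of the genes in each cluster.
    cluster_ids = np.unique(labels)
    supergenes = np.column_stack(
        [X[:, labels == c].mean(axis=1) for c in cluster_ids]
    )

    # Step 2: Lasso regression on the supergenes instead of the raw genes.
    model = Lasso(alpha=alpha).fit(supergenes, y)
    return labels, supergenes, model


# Simulated example with far more features (300 genes) than samples (40).
rng = np.random.default_rng(0)
n, genes_per_group, n_groups = 40, 50, 6
latent = rng.normal(size=(n, n_groups))
p = genes_per_group * n_groups
# Genes in the same group share a latent signal plus independent noise.
X = np.repeat(latent, genes_per_group, axis=1) + 0.5 * rng.normal(size=(n, p))
y = 2.0 * latent[:, 0] - 1.5 * latent[:, 1] + 0.1 * rng.normal(size=n)

labels, supergenes, model = supergene_lasso(X, y, n_clusters=6, alpha=0.1)
print("supergene matrix shape:", supergenes.shape)
print("nonzero coefficients:", int(np.count_nonzero(model.coef_)))
```

Averaging within a cluster reduces the noise of each predictor, which is the variance-reduction property the abstract relies on. In practice the number of clusters and the penalty would be tuned, for example by cross-validation.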

Similar articles

Averaged gene expressions for regression.

Biostatistics. 2007 Apr;8(2):212-27. doi: 10.1093/biostatistics/kxl002. Epub 2006 May 11.

Model-based clustering on the unit sphere with an illustration using gene expression profiles.

Biostatistics. 2008 Jan;9(1):66-80. doi: 10.1093/biostatistics/kxm012. Epub 2007 Apr 27.

A simple and robust algorithm for microarray data clustering based on gene population-variance ratio metric.

Biotechnol J. 2009 Sep;4(9):1357-61. doi: 10.1002/biot.200800219.

The clustering of regression models method with applications in gene expression data.

Biometrics. 2006 Jun;62(2):526-33. doi: 10.1111/j.1541-0420.2005.00498.x.

Clustering threshold gradient descent regularization: with applications to microarray studies.

Bioinformatics. 2007 Feb 15;23(4):466-72. doi: 10.1093/bioinformatics/btl632. Epub 2006 Dec 20.

Variable selection for model-based high-dimensional clustering and its application to microarray data.

Biometrics. 2008 Jun;64(2):440-8. doi: 10.1111/j.1541-0420.2007.00922.x. Epub 2007 Oct 26.

Dimension reduction for classification with gene expression microarray data.

Stat Appl Genet Mol Biol. 2006;5:Article6. doi: 10.2202/1544-6115.1147. Epub 2006 Feb 24.

A novel approach for discovering overlapping clusters in gene expression data.

IEEE Trans Biomed Eng. 2009 Jul;56(7):1803-9. doi: 10.1109/TBME.2009.2015055. Epub 2009 Feb 20.

Techniques for clustering gene expression data.

Comput Biol Med. 2008 Mar;38(3):283-93. doi: 10.1016/j.compbiomed.2007.11.001. Epub 2007 Dec 3.

Reducing microarray data via nonnegative matrix factorization for visualization and clustering analysis.

J Biomed Inform. 2008 Aug;41(4):602-6. doi: 10.1016/j.jbi.2007.12.003. Epub 2007 Dec 23.

Cited by

Groupwise structural sparsity for discriminative voxels identification.

Front Neurosci. 2023 Sep 7;17:1247315. doi: 10.3389/fnins.2023.1247315. eCollection 2023.

A Robust Personalized Classification Method for Breast Cancer Metastasis Prediction.

Cancers (Basel). 2022 Oct 29;14(21):5327. doi: 10.3390/cancers14215327.

Predicting the pathogenicity of bacterial genomes using widely spread protein families.

BMC Bioinformatics. 2022 Jun 24;23(1):253. doi: 10.1186/s12859-022-04777-w.

Robust edge-based biomarker discovery improves prediction of breast cancer metastasis.

BMC Bioinformatics. 2020 Sep 30;21(Suppl 14):359. doi: 10.1186/s12859-020-03692-2.

Fast computation of genome-metagenome interaction effects.

Algorithms Mol Biol. 2020 Jul 1;15:13. doi: 10.1186/s13015-020-00173-2. eCollection 2020.

Bayesian Hyper-LASSO Classification for Feature Selection with Application to Endometrial Cancer RNA-seq Data.

Sci Rep. 2020 Jun 16;10(1):9747. doi: 10.1038/s41598-020-66466-z.

Comparative evaluation of network features for the prediction of breast cancer metastasis.

BMC Med Genomics. 2020 Apr 3;13(Suppl 5):40. doi: 10.1186/s12920-020-0676-3.

Significant random signatures reveals new biomarker for breast cancer.

BMC Med Genomics. 2019 Nov 8;12(1):160. doi: 10.1186/s12920-019-0609-1.

Radiomics in nuclear medicine: robustness, reproducibility, standardization, and how to avoid data analysis traps and replication crisis.

Eur J Nucl Med Mol Imaging. 2019 Dec;46(13):2638-2655. doi: 10.1007/s00259-019-04391-8. Epub 2019 Jun 25.

A data-driven interactome of synergistic genes improves network-based cancer outcome prediction.

PLoS Comput Biol. 2019 Feb 6;15(2):e1006657. doi: 10.1371/journal.pcbi.1006657. eCollection 2019 Feb.
