Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data.

Authors

Tan Yongxi, Shi Leming, Tong Weida, Wang Charles

Affiliation

Department of Medicine, Cedars-Sinai Medical Center, David Geffen School of Medicine UCLA, Los Angeles, CA 90048, USA.

Publication information

Nucleic Acids Res. 2005 Jan 7;33(1):56-65. doi: 10.1093/nar/gki144. Print 2005.

DOI: 10.1093/nar/gki144
PMID: 15640445
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC546133/
Abstract

DNA microarray technology provides a promising approach to the diagnosis and prognosis of tumors on a genome-wide scale by monitoring the expression levels of thousands of genes simultaneously. One problem arising from the use of microarray data is the difficulty of analyzing high-dimensional gene expression data, which typically have thousands of variables (genes), far fewer observations (samples), and often severe collinearity. This makes it difficult to apply classical statistical methods directly to microarray data. In this paper, total principal component regression (TPCR) is proposed to classify human tumors by extracting the latent variable structure underlying microarray data from the augmented subspace of both independent and dependent variables. A salient feature of the method is that it takes into account not only the latent variable structure but also the errors in the microarray gene expression profiles (the independent variables). The prediction performance of TPCR was evaluated by both leave-one-out and leave-half-out cross-validation using four well-known microarray datasets. The stability and reliability of the classification models were further assessed by re-randomization and permutation studies. A fast kernel algorithm was applied to decrease the computation time dramatically. (MATLAB source code is available upon request.)
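
The abstract does not spell out the TPCR algorithm, so the following is only a minimal sketch of the general idea, under loudly stated assumptions: the "augmented subspace" is approximated here by a truncated total least squares fit on the centred matrix [X Y], with one-hot class indicators as Y. The function names (`fit_ttls`, `loo_accuracy`, `leave_half_out_accuracy`, `make_toy_data`), the toy data, and the choice of k are all hypothetical and are not taken from the paper or its MATLAB code; the fast kernel algorithm is not reproduced.

```python
import numpy as np


def fit_ttls(X, Y, k):
    """Truncated total least squares: fit Y ~ X @ B from the top-k
    right singular vectors of the centred augmented matrix [X Y]."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Z = np.hstack([X - x_mean, Y - y_mean])      # n x (p + c)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:k].T                                 # (p + c) x k loadings
    p = X.shape[1]
    V11, V21 = V[:p], V[p:]                      # gene block, class block
    # Minimum-norm solution of V11.T @ B = V21.T
    B = np.linalg.pinv(V11.T) @ V21.T            # p x c
    return x_mean, y_mean, B


def scores(model, X):
    x_mean, y_mean, B = model
    return (X - x_mean) @ B + y_mean


def classify(model, X):
    # Assign each sample to the class with the largest fitted indicator.
    return scores(model, X).argmax(axis=1)


def one_hot(labels, n_classes):
    return np.eye(n_classes)[labels]


def loo_accuracy(X, labels, k):
    """Leave-one-out cross-validation accuracy."""
    n, n_classes = len(labels), labels.max() + 1
    hits = 0
    for i in range(n):
        train = np.arange(n) != i
        model = fit_ttls(X[train], one_hot(labels[train], n_classes), k)
        hits += int(classify(model, X[i:i + 1])[0] == labels[i])
    return hits / n


def leave_half_out_accuracy(X, labels, k, n_splits, rng):
    """Mean test accuracy over random half/half splits."""
    n, n_classes = len(labels), labels.max() + 1
    accs = []
    for _ in range(n_splits):
        order = rng.permutation(n)
        train, test = order[: n // 2], order[n // 2:]
        model = fit_ttls(X[train], one_hot(labels[train], n_classes), k)
        accs.append(np.mean(classify(model, X[test]) == labels[test]))
    return float(np.mean(accs))


def make_toy_data(rng, n_per_class=10, n_classes=3, n_genes=200,
                  shift=3.0, block=30):
    """Synthetic 'expression' data: each class shifts its own gene block."""
    labels = np.repeat(np.arange(n_classes), n_per_class)
    X = rng.normal(size=(labels.size, n_genes))
    for j in range(n_classes):
        X[labels == j, j * block:(j + 1) * block] += shift
    return X, labels


rng = np.random.default_rng(0)
X, labels = make_toy_data(rng)
print("leave-one-out accuracy:", loo_accuracy(X, labels, k=2))
print("leave-half-out accuracy:",
      leave_half_out_accuracy(X, labels, k=2, n_splits=20, rng=rng))
# Permutation check: shuffled labels should no longer be predictable.
print("permuted-label accuracy:",
      loo_accuracy(X, rng.permutation(labels), k=2))
```

Fitting on [X Y] rather than on X alone is what lets noise in the expression profiles enter the model, which is the property the abstract highlights. The permuted-label run mirrors the permutation studies mentioned above: a model that scores well on real labels but not on shuffled ones is less likely to be fitting noise.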
