Normalization benefits microarray-based classification.

Author information

Hua Jianping, Balagurunathan Yoganand, Chen Yidong, Lowey James, Bittner Michael L, Xiong Zixiang, Suh Edward, Dougherty Edward R

Affiliation

Computational Biology Division, Translational Genomics Research Institute, Phoenix, AZ 85004, USA.

Publication information

EURASIP J Bioinform Syst Biol. 2006;2006(1):43056. doi: 10.1155/BSB/2006/43056.

DOI:10.1155/BSB/2006/43056

PMID:18427588

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3171318/

Abstract

When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis is applied, its objective being to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is the expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results of the first three are presented in the paper, with the full results being given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.
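The three normalization methods named in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the common two-channel convention M = log2(R/G) and A = 0.5 * log2(R * G), uses the median as the offset, and replaces full Lowess with a simplified local linear fit (tricube weights, no robustness iterations). The function names and the `frac` parameter are hypothetical.

```python
import math
import random


def offset_normalize(m, a):
    """Offset method: subtract one constant (here the median log-ratio)."""
    s = sorted(m)
    n = len(s)
    med = s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])
    return [mi - med for mi in m]


def _weighted_line(x, y, w):
    """Weighted least-squares fit y ~ b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx if sxx > 0 else 0.0
    return my - b1 * mx, b1


def linear_normalize(m, a):
    """Linear regression: subtract a straight-line fit of M on A."""
    b0, b1 = _weighted_line(a, m, [1.0] * len(m))
    return [mi - (b0 + b1 * ai) for mi, ai in zip(m, a)]


def lowess_normalize(m, a, frac=0.5):
    """Simplified Lowess: subtract a local linear fit at each spot.

    Assumption: tricube weights over the nearest frac * n spots and
    no robustness iterations, unlike full Lowess.
    """
    n = len(m)
    k = min(n, max(2, math.ceil(frac * n)))
    out = []
    for a0, m0 in zip(a, m):
        dist = [abs(ai - a0) for ai in a]
        # Bandwidth: distance to the k-th nearest spot (guard against ties).
        h = sorted(dist)[k - 1] or 1e-12
        w = [(1.0 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dist]
        b0, b1 = _weighted_line(a, m, w)
        out.append(m0 - (b0 + b1 * a0))
    return out


if __name__ == "__main__":
    # Synthetic spots with an intensity-dependent (curved) labeling bias.
    random.seed(1)
    a = [random.uniform(6, 14) for _ in range(100)]
    m = [0.05 * (ai - 10) ** 2 + random.gauss(0, 0.1) for ai in a]
    for name, fn in [("offset", offset_normalize),
                     ("linear", linear_normalize),
                     ("lowess", lowess_normalize)]:
        res = fn(m, a)
        print(name, sum(abs(r) for r in res) / len(res))
```

The offset method can only remove a constant shift, the linear fit additionally removes a trend in intensity, and the local fit can follow a curved bias, which is why the three are usually ordered by flexibility in this way.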
