Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method.

Authors

Bengtsson Henrik, Hössjer Ola

Affiliation

Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Box 118, SE-221 00 Lund, Sweden.

Publication information

BMC Bioinformatics. 2006 Mar 1;7:100. doi: 10.1186/1471-2105-7-100.

DOI:10.1186/1471-2105-7-100

PMID:16509971

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1534066/

Abstract

BACKGROUND

Low-level processing and normalization of microarray data are among the most important steps in microarray analysis and have a profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is best. It is therefore important to study the different normalization methods, and the nature of microarray data in general, in further detail.

RESULTS

A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization are revisited in the light of the affine model and their strengths and weaknesses are investigated in this context. As a direct result from this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method.

CONCLUSION

We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and assures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. All methods are made available in the aroma package, which is a platform-independent package for R.
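The claim that an affine model can produce intensity-dependent curvature in log-ratios can be illustrated with a small numerical sketch. It assumes a two-channel model of the form y = a + b·x per channel, with hypothetical offsets `a_r` and `a_g` and unit scale factors chosen only for illustration. The offsets are treated as known here, whereas in practice they must be estimated from the data. This sketch is not the robust estimation procedure proposed in the paper and does not use the aroma package.

```python
import numpy as np

def log_ratio_and_intensity(y_r, y_g):
    """Return (M, A): log2 ratio and mean log2 intensity of two channels."""
    m = np.log2(y_r / y_g)
    a = 0.5 * np.log2(y_r * y_g)
    return m, a

# Hypothetical true signals for non-differentially expressed genes:
# the same value x in both channels, spanning several orders of magnitude.
x = np.logspace(1, 4.5, 8)

# Assumed affine distortion per channel: y = offset + scale * x.
# These offsets and scales are illustrative values, not taken from the paper.
a_r, a_g = 200.0, 50.0
b_r, b_g = 1.0, 1.0
y_r = a_r + b_r * x
y_g = a_g + b_g * x

# Raw log-ratios are far from zero at low intensity and approach
# log2(b_r / b_g) = 0 at high intensity, giving a curved M-vs-A pattern.
m_raw, a_raw = log_ratio_and_intensity(y_r, y_g)

# Inverting the affine transform (with the offsets assumed known)
# restores log-ratios of zero at every intensity.
x_r_hat = (y_r - a_r) / b_r
x_g_hat = (y_g - a_g) / b_g
m_norm, a_norm = log_ratio_and_intensity(x_r_hat, x_g_hat)

for a_val, m_before, m_after in zip(a_raw, m_raw, m_norm):
    print(f"A={a_val:6.2f}  M_raw={m_before:7.4f}  M_normalized={m_after:7.4f}")
```

Although each channel is only shifted and scaled, the raw log-ratio changes with intensity because the difference in offsets dominates at low signal and becomes negligible at high signal. Removing the offsets before taking logarithms makes the log-ratios of these non-differentially expressed genes flat at zero.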

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c55/1534066/9b6156fccfd0/1471-2105-7-100-1.jpg
