An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies.

Affiliation

Department of Bioinformatics-BiGCaT, Maastricht University, Maastricht, The Netherlands.

Publication information

BMC Genomics. 2012 Jan 25;13:42. doi: 10.1186/1471-2164-13-42.

Abstract

BACKGROUND

The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and study design are all different from transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. Particularly, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate.

RESULTS

We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures.
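As an illustration of one of the listed methods, the sketch below centers the log-ratios of a single array on a robust location estimate. It is a minimal sketch, not the authors' implementation: it assumes that "Tukey's biweight scaling" means subtracting a one-step Tukey biweight mean from the log2(IP/input) values, and the function names, the tuning constant `c=5.0` and the small `eps` guard are illustrative choices.

```python
import numpy as np

def tukey_biweight(x, c=5.0, eps=1e-4):
    """One-step Tukey biweight location estimate of a 1-D array."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))          # median absolute deviation
    u = (x - med) / (c * mad + eps)           # scaled distance from the median
    # Bisquare weights: points far from the median get weight 0.
    w = np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)
    return np.sum(w * x) / np.sum(w)

def biweight_scale(log_ratio):
    """Center the log-ratios of one array on their biweight mean (assumed definition)."""
    log_ratio = np.asarray(log_ratio, dtype=float)
    return log_ratio - tukey_biweight(log_ratio)
```

Because strongly enriched probes receive little or no weight, the estimated center is driven mainly by the bulk of un-enriched probes, and subtracting a single constant leaves the shape of the distribution untouched.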

CONCLUSION

T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will impact the reliability of the downstream analysis substantially.
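The idea of quantile normalization applied separately on the channels can be sketched as follows. This is a minimal sketch under stated assumptions rather than the procedure used in the paper: it assumes the inputs are log2 intensities arranged as probes by arrays, that the immunoprecipitated (IP) and input channels are each quantile-normalized across arrays on their own, and that log-ratios are formed afterwards. The function names are hypothetical, and ties are resolved by sort order rather than averaged.

```python
import numpy as np

def quantile_normalize(x):
    """Give every column (array) of x the same empirical distribution."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                     # per-array sort order
    sorted_x = np.take_along_axis(x, order, axis=0)
    reference = sorted_x.mean(axis=1)                 # mean of each quantile across arrays
    out = np.empty_like(x)
    # Write the reference quantiles back in each array's original probe order.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out

def per_channel_quantile(ip_log2, input_log2):
    """Normalize each channel across arrays separately, then return log-ratios."""
    ip_norm = quantile_normalize(ip_log2)
    input_norm = quantile_normalize(input_log2)
    return ip_norm - input_norm
```

Normalizing the channels separately never forces the IP distribution onto the input distribution, which is one plausible reason the separation between enriched and un-enriched probes is preserved while arrays still become comparable to one another.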

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c8f/3293711/fb30d5aabe1c/1471-2164-13-42-1.jpg
