Naeem Haroon, Wong Nicholas C, Chatterton Zac, Hong Matthew K H, Pedersen John S, Corcoran Niall M, Hovens Christopher M, Macintyre Geoff
NICTA Victoria Research Laboratory, Department of Electrical and Electronic Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia.
BMC Genomics. 2014 Jan 22;15(1):51. doi: 10.1186/1471-2164-15-51.
The Illumina HumanMethylation450 BeadChip (HM450K) measures the DNA methylation of 485,512 CpGs in the human genome. The technology relies on hybridization of genomic fragments to probes on the chip. However, certain genomic factors, such as single nucleotide polymorphisms (SNPs), small insertions and deletions (INDELs), repetitive DNA, and regions with reduced genomic complexity, may compromise the array's ability to measure methylation. Currently, there is no clear method or pipeline for determining which probes on the HM450K bead array should be retained for subsequent analysis in light of these issues.
We comprehensively assessed the effects of SNPs, INDELs, repeats and bisulfite-induced reduced genomic complexity by comparing HM450K bead array results with whole-genome bisulfite sequencing. We determined which CpG probes provided accurate signals and which provided noisy signals. From this, we derived a set of high-quality probes that provide unadulterated measurements of DNA methylation.
Our method significantly reduces the risk of false discoveries when using the HM450K bead array, while maximising the power of the array to detect methylation status genome-wide. Additionally, we demonstrate the utility of our method through extraction of biologically relevant epigenetic changes in prostate cancer.
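The comparison of array results against whole-genome bisulfite sequencing described above could be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the function name `flag_noisy_probes`, the input layout (probe ID mapped to per-sample methylation fractions), the mean absolute difference as the agreement measure, and the 0.2 cutoff are all assumptions introduced here, and the probe IDs and values are made-up toy data.

```python
from statistics import mean


def flag_noisy_probes(array_beta, wgbs_beta, max_mean_abs_diff=0.2):
    """Split probes into (retained, noisy) by agreement with sequencing.

    array_beta, wgbs_beta: dicts mapping a probe ID to a list of
    methylation fractions in [0, 1], one per sample, same sample order.
    max_mean_abs_diff: hypothetical cutoff, not a value from the paper.
    """
    retained, noisy = [], []
    for probe_id, array_values in array_beta.items():
        wgbs_values = wgbs_beta.get(probe_id)
        # Skip probes that cannot be compared sample by sample.
        if not array_values or wgbs_values is None:
            continue
        if len(wgbs_values) != len(array_values):
            continue
        # Mean absolute difference between array and sequencing values.
        diff = mean(abs(a - w) for a, w in zip(array_values, wgbs_values))
        if diff <= max_mean_abs_diff:
            retained.append(probe_id)
        else:
            noisy.append(probe_id)
    return retained, noisy


# Toy data with hypothetical probe IDs.
array_beta = {
    "probe_a": [0.10, 0.80],  # agrees closely with sequencing
    "probe_b": [0.50, 0.55],  # disagrees strongly with sequencing
    "probe_c": [0.90],        # no sequencing data, left unclassified
}
wgbs_beta = {
    "probe_a": [0.12, 0.78],
    "probe_b": [0.05, 0.95],
}

retained, noisy = flag_noisy_probes(array_beta, wgbs_beta)
print("retained:", retained)
print("noisy:", noisy)
```

In practice, a filter of this kind would probably be combined with annotation-based exclusions (probes overlapping known SNPs, INDELs or repeats), and the cutoff would need to be chosen from the data rather than fixed in advance.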