Cho Soonweng, Kim Hyun-Seok, Zeiger Martha A, Umbricht Christopher B, Cope Leslie M
1 Department of Psychiatry and Behavioral Sciences, The Johns Hopkins University School of Medicine, Baltimore, Maryland.
2 Department of Medicine, Rutgers New Jersey Medical School, New Brunswick, New Jersey.
J Comput Biol. 2019 Apr;26(4):295-304. doi: 10.1089/cmb.2018.0143. Epub 2019 Feb 21.
Genetic and epigenetic changes drive carcinogenesis, and their integrated analysis provides insight into the mechanisms of cancer development. Several computational methods have been developed to measure copy number variation (CNV) from methylation array data, including ChAMP-CNV, CN450K, and Epicopy, which is introduced here. Using paired single nucleotide polymorphism (SNP) and methylation array data from the public repository of The Cancer Genome Atlas, we optimized the CNV-calling thresholds of all three methods, benchmarked them, and found their performance to be comparable. Using Epicopy as a representative method for the Illumina450K array, we show that Illumina450K-derived CNV methods achieve a sensitivity of 0.7 and a positive predictive value of 0.75 in identifying CNVs, similar to the results obtained when competing SNP microarray platforms are compared with each other.
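The two reported metrics can be illustrated with a minimal Python sketch. It assumes that SNP-array calls serve as the reference and that methylation-derived log ratios are turned into calls with a single symmetric threshold. The function names (`call_cnv`, `sensitivity_ppv`), the per-locus labels (1 for gain, -1 for loss, 0 for neutral), and the threshold value are hypothetical choices for illustration; they are not the Epicopy, ChAMP-CNV, or CN450K interfaces, and the actual benchmarking procedure may differ.

```python
def call_cnv(log_ratios, threshold):
    """Label each log ratio as gain (1), loss (-1), or neutral (0).

    The symmetric threshold is an illustrative assumption.
    """
    return [1 if x > threshold else -1 if x < -threshold else 0
            for x in log_ratios]


def sensitivity_ppv(reference, called):
    """Compare calls with reference labels, locus by locus.

    A true positive is a locus where the reference is non-neutral and
    the call matches it in direction.
    Sensitivity = TP / (reference CNV loci); PPV = TP / (called CNV loci).
    """
    tp = sum(1 for r, c in zip(reference, called) if r != 0 and c == r)
    ref_pos = sum(1 for r in reference if r != 0)
    call_pos = sum(1 for c in called if c != 0)
    sensitivity = tp / ref_pos if ref_pos else float("nan")
    ppv = tp / call_pos if call_pos else float("nan")
    return sensitivity, ppv
```

Sweeping `threshold` over a range of values and recomputing both metrics each time is one simple way to explore the trade-off that threshold optimization addresses: a lower threshold tends to raise sensitivity at the cost of positive predictive value.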