Zhang Naiqian, Wu Hua-Jun, Zhang Weiwei, Wang Jun, Wu Hao, Zheng Xiaoqi
Department of Mathematics, Shanghai Normal University, Shanghai 200234, China.
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA 02215, USA.
Bioinformatics. 2015 Nov 1;31(21):3401-5. doi: 10.1093/bioinformatics/btv370. Epub 2015 Jun 25.
In cancer genomics research, an important problem is that solid tissue samples obtained in clinical settings are always a mixture of cancer and normal cells. This mixture complicates data analysis and biases the findings if it is not correctly accounted for. Estimating tumor purity is therefore of great interest, and a number of methods have been developed that use gene expression, copy number variation or point mutation data.
We find that in cancer samples, the distributions of data from the Illumina Infinium 450k methylation microarray are highly correlated with tumor purity. We develop a simple but effective method to estimate purity from the microarray data. Analyses of lung cancer data from The Cancer Genome Atlas demonstrate favorable performance of the proposed method.
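To make the link between methylation values and purity concrete, the following toy simulation may help. It is a sketch under stated assumptions and is not the InfiniumPurify algorithm, which this abstract does not describe. It assumes that the observed beta value at each CpG site is a linear mixture of a tumor level and a normal level, weighted by purity, and that reference levels for both cell types are known. All function names, parameters, and the simulated values are hypothetical.

```python
import random
import statistics


def mix_beta(tumor_beta, normal_beta, purity):
    """Beta value of a sample that is `purity` tumor cells and the rest normal.

    Assumption: a simple linear mixture of the two cell types.
    """
    return purity * tumor_beta + (1.0 - purity) * normal_beta


def estimate_purity(observed, tumor_ref, normal_ref, min_diff=0.2):
    """Median of per-site estimates obtained by inverting the linear mixture.

    Only sites whose tumor and normal reference levels differ by more than
    `min_diff` are used, since the others carry little purity information.
    """
    per_site = [
        (o - n) / (t - n)
        for o, t, n in zip(observed, tumor_ref, normal_ref)
        if abs(t - n) > min_diff
    ]
    # Purity is a proportion, so clip the estimate to [0, 1].
    return min(1.0, max(0.0, statistics.median(per_site)))


def simulate(purity, n_sites=500, noise_sd=0.02, seed=0):
    """Simulate hypothetical sites that are hypermethylated in tumor cells."""
    rng = random.Random(seed)
    tumor_ref = [rng.uniform(0.6, 0.95) for _ in range(n_sites)]
    normal_ref = [rng.uniform(0.05, 0.3) for _ in range(n_sites)]
    observed = [
        mix_beta(t, n, purity) + rng.gauss(0.0, noise_sd)
        for t, n in zip(tumor_ref, normal_ref)
    ]
    return observed, tumor_ref, normal_ref


if __name__ == "__main__":
    for true_purity in (0.3, 0.6, 0.9):
        obs, t_ref, n_ref = simulate(true_purity)
        print(true_purity, round(estimate_purity(obs, t_ref, n_ref), 3))
```

In this simulation the observed values move from the normal levels toward the tumor levels as purity rises, which is the kind of purity-dependent shift in the data distribution that the abstract refers to. Real data would not come with known tumor reference levels, so a practical method has to work without them.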
The method is implemented in InfiniumPurify, which is freely available at https://bitbucket.org/zhengxiaoqi/infiniumpurify.
xqzheng@shnu.edu.cn or hao.wu@emory.edu
Supplementary data are available at Bioinformatics online.