Department of Mathematics, Shanghai Normal University, Shanghai 200234, China.
School of Science, East China University of Technology, Nanchang, Jiangxi 330013, China.
Bioinformatics. 2017 Sep 1;33(17):2651-2657. doi: 10.1093/bioinformatics/btx303.
Tumor sample classification has long been an important task in cancer research. Classifying tumors into different subtypes greatly benefits therapeutic development and facilitates the application of precision medicine to patients. In practice, solid tumor tissue samples obtained from clinical settings are always mixtures of cancer and normal cells. Thus, the data obtained from these samples are mixed signals. The 'tumor purity', or the percentage of cancer cells in a cancer tissue sample, will bias the clustering results if not properly accounted for.
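The effect of purity on a mixed signal can be illustrated with a minimal sketch. It assumes a simple linear mixing model, in which the observed methylation level is the purity-weighted average of a tumor level and a normal level; this model, the function names and the numeric values are illustrative assumptions and are not taken from the article.

```python
def mix_signal(tumor, normal, purity):
    """Observed level under an ASSUMED linear mixing model (illustrative only)."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be in [0, 1]")
    return purity * tumor + (1.0 - purity) * normal


def unmix_signal(observed, normal, purity):
    """Invert the assumed mixing model to recover the tumor-only level."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return (observed - (1.0 - purity) * normal) / purity


# Hypothetical values: two samples of the same subtype (tumor level 0.9,
# normal level 0.1) that differ only in purity.
high_purity = mix_signal(0.9, 0.1, 0.8)
low_purity = mix_signal(0.9, 0.1, 0.3)

# The raw signals differ although the underlying tumor level is identical.
print(high_purity, low_purity)

# Correcting for purity recovers the same tumor level for both samples.
print(unmix_signal(high_purity, 0.1, 0.8), unmix_signal(low_purity, 0.1, 0.3))
```

Under this assumption, two samples of the same subtype can look quite different before correction, which is how purity can confound a clustering that ignores it.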
In this article, we developed a model-based clustering method and an R function that uses DNA methylation microarray data to infer tumor subtypes while accounting for tumor purity. Simulation studies and analyses of The Cancer Genome Atlas data demonstrate improved results compared with existing methods.
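The general idea of clustering with purity taken into account can be sketched as follows. This is not the InfiniumClust model or its API: it is a simplified hard-assignment, least-squares stand-in that assumes known per-sample purity, a known normal profile and a linear mixing model. The function name, the starting centers and the toy data are all hypothetical.

```python
def purity_aware_cluster(samples, purity, normal, centers, n_iter=20):
    """Hard-assignment clustering under an ASSUMED linear mixing model.

    Each sample is modelled as purity * subtype_center + (1 - purity) * normal.
    Illustrative stand-in only; not the method described in the article.
    """
    centers = [list(c) for c in centers]
    labels = [0] * len(samples)
    for _ in range(n_iter):
        # Assignment step: pick the subtype whose purity-mixed profile fits best.
        for i, (y, p) in enumerate(zip(samples, purity)):
            costs = [
                sum((yj - p * mj - (1 - p) * nj) ** 2
                    for yj, mj, nj in zip(y, mu, normal))
                for mu in centers
            ]
            labels[i] = costs.index(min(costs))
        # Update step: least-squares estimate of each subtype center.
        for k in range(len(centers)):
            members = [i for i, lab in enumerate(labels) if lab == k]
            denom = sum(purity[i] ** 2 for i in members)
            if denom == 0:
                continue  # empty cluster: keep the previous center
            for j in range(len(normal)):
                num = sum(
                    purity[i] * (samples[i][j] - (1 - purity[i]) * normal[j])
                    for i in members
                )
                centers[k][j] = num / denom
    return labels, centers


# Toy, noise-free data: two hypothetical subtypes measured at two CpG sites.
normal = [0.5, 0.5]
subtype_a, subtype_b = [0.9, 0.2], [0.2, 0.9]
purity = [0.9, 0.5, 0.2, 0.8, 0.4, 0.3]
truth = [subtype_a] * 3 + [subtype_b] * 3
samples = [
    [p * t + (1 - p) * n for t, n in zip(mu, normal)]
    for p, mu in zip(purity, truth)
]

labels, centers = purity_aware_cluster(
    samples, purity, normal, centers=[[0.8, 0.3], [0.3, 0.8]]
)
print(labels)
print(centers)
```

The starting centers are passed in explicitly so the toy run is deterministic. A model-based method would typically replace the hard assignment with probabilistic cluster memberships.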
InfiniumClust is part of the R package InfiniumPurify, which is freely available from CRAN (https://cran.r-project.org/web/packages/InfiniumPurify/index.html).
hao.wu@emory.edu or xqzheng@shnu.edu.cn.
Supplementary data are available at Bioinformatics online.