a Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China.
b National Center for International Research of Biological Targeting Diagnosis and Therapy, Guangxi Key Laboratory of Biological Targeting Diagnosis and Therapy Research, Collaborative Innovation Center for Targeting Tumor Diagnosis and Therapy, Guangxi Medical University, Nanning, China.
Epigenetics. 2019 Jan;14(1):67-80. doi: 10.1080/15592294.2019.1568178. Epub 2019 Jan 29.
DNA methylation status is closely associated with diverse diseases and is generally more stable than gene expression, so aberrant DNA methylation could provide important biomarkers for tumor diagnosis, treatment, and prognosis. However, DNA methylation signatures for pan-cancer diagnosis and prognosis remain underexplored. Here, we systematically analyzed genome-wide DNA methylation patterns across diverse TCGA cancers using machine learning. We identified seven CpG sites that effectively discriminate tumor samples from adjacent normal tissue samples in 12 main TCGA cancers (1216 samples, AUC > 0.99). These seven potential diagnostic biomarkers were further validated in 9 other TCGA cancers and 4 independent datasets (AUC > 0.92). Three of the seven CpG sites were correlated with cell division, DNA replication, and the cell cycle. We also identified 12 CpG sites that effectively distinguish 26 different cancers (7605 samples); this result was reproducible in independent datasets and in two disparate tumors with metastases (micro-average AUC > 0.89). Furthermore, survival analysis identified a series of potential signatures that significantly predict patient prognosis in 7 different cancers (p-value < 1e-4). Collectively, DNA methylation patterns vary greatly between tumor and adjacent normal tissues, as well as among different cancer types. The identified signatures may aid clinical decisions on pan-cancer diagnosis and prognosis, and the potential cancer-specific biomarkers could be used to predict the primary site of metastatic breast and prostate cancers.
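The abstract reports a micro-average AUC for the multi-cancer classifier but does not say how it was computed. As an illustration only, the sketch below uses one common definition: each sample is expanded into one-vs-rest (label, score) pairs across all classes, the pairs are pooled, and a single AUC is computed on the pool. The function names `roc_auc` and `micro_average_auc` and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary AUC from the Mann-Whitney U statistic (ties get average ranks)."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


def micro_average_auc(
    true_classes: Sequence[int],
    score_matrix: Sequence[Sequence[float]],
    n_classes: int,
) -> float:
    """Pool one-vs-rest (label, score) pairs over all classes, then take one AUC."""
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for true_class, row in zip(true_classes, score_matrix):
        for k in range(n_classes):
            flat_labels.append(1 if k == true_class else 0)
            flat_scores.append(row[k])
    return roc_auc(flat_labels, flat_scores)


if __name__ == "__main__":
    # Toy data (assumed, not from the paper): 3 samples, 3 cancer types.
    toy_true = [0, 1, 2]
    toy_scores = [
        [0.90, 0.05, 0.05],
        [0.10, 0.80, 0.10],
        [0.20, 0.20, 0.60],
    ]
    print(micro_average_auc(toy_true, toy_scores, n_classes=3))
```

Because every (sample, class) pair is pooled, cancer types with more samples contribute more pairs and therefore weigh more heavily in a micro-average than in a macro-average, which averages per-class AUCs equally.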