Liu Pengfei
State Key Laboratory of Genetic Engineering, Collaborative Innovation Center For Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China.
Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, China.
Front Genet. 2022 Jan 6;12:798748. doi: 10.3389/fgene.2021.798748. eCollection 2021.
Metastatic cancer of unknown primary (CUP) site remains a leading cause of cancer death, with few therapeutic options. Aberrant DNA methylation (DNAm), which shows a degree of tissue specificity, is a major risk factor for cancer. However, how DNAm alterations in tumors vary across multi-omics regulatory networks remains largely unexplored. There is therefore room to improve the accuracy of tumor origin prediction, and a need to better understand the underlying mechanisms. In this study, an integrative analysis of multi-omics data and molecular regulatory networks characterized genome-wide methylation mechanisms and identified 23 epi-driver genes. Beyond the promoter region, aberrant methylation within gene bodies and intergenic regions was also significantly associated with gene expression. Enrichment analysis of the epi-driver genes indicated that they are closely related to cellular mechanisms of tumorigenesis, including T-cell differentiation, cell proliferation, and signal transduction. Using an ensemble algorithm, six CpG sites located in five epi-driver genes were selected to construct a tissue-specific classifier that achieved an accuracy above 95% on TCGA datasets. In independent datasets and metastatic cancer datasets from GEO, the classifier distinguished tumor subtypes or sites of origin with an accuracy above 90%, indicating good robustness and stability. In summary, the integrative analysis of large-scale omics data revealed complex regulation of DNAm across cancer types and identified epi-driver genes participating in tumorigenesis. Based on aberrant methylation status in these epi-driver genes, we established a classifier that traces metastatic cancers back to their primary sites with high accuracy. Our study provides a comprehensive, multi-omics view of DNAm-associated changes across cancer types and has potential for clinical application.
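As a rough illustration of how an ensemble classifier over a handful of CpG sites might assign a tissue of origin, the sketch below bags nearest-centroid models on synthetic beta values. It is not the authors' method: the abstract does not name the ensemble algorithm, the CpG sites, or the tissues, so the tissue labels, mean beta values, noise level, and all function names here are hypothetical placeholders.

```python
import random
from collections import Counter

# Hypothetical mean methylation (beta) values at six CpG sites per tissue.
# These numbers are invented for illustration; they are not TCGA or GEO data.
TISSUE_MEANS = {
    "breast": [0.1, 0.8, 0.2, 0.7, 0.3, 0.9],
    "colon": [0.8, 0.2, 0.7, 0.1, 0.9, 0.3],
    "lung": [0.5, 0.5, 0.9, 0.9, 0.1, 0.1],
}


def simulate(n_per_tissue, rng, noise=0.08):
    """Draw noisy beta values, clipped to [0, 1], around each tissue's means."""
    samples = []
    for tissue, means in TISSUE_MEANS.items():
        for _ in range(n_per_tissue):
            betas = [min(1.0, max(0.0, rng.gauss(m, noise))) for m in means]
            samples.append((betas, tissue))
    return samples


def fit_centroids(samples):
    """Base learner: the mean beta vector of each tissue seen in the samples."""
    sums, counts = {}, Counter()
    for betas, tissue in samples:
        acc = sums.setdefault(tissue, [0.0] * len(betas))
        for i, b in enumerate(betas):
            acc[i] += b
        counts[tissue] += 1
    return {t: [s / counts[t] for s in acc] for t, acc in sums.items()}


def predict_one(centroids, betas):
    """Return the tissue whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((b - c) ** 2 for b, c in zip(betas, centroid))
    return min(centroids, key=lambda t: sq_dist(centroids[t]))


def fit_ensemble(train, n_models, rng):
    """Bagging: fit each base learner on a bootstrap resample of the training set."""
    models = []
    for _ in range(n_models):
        boot = [rng.choice(train) for _ in train]
        models.append(fit_centroids(boot))
    return models


def predict(models, betas):
    """Majority vote across the base learners."""
    votes = Counter(predict_one(m, betas) for m in models)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
train = simulate(30, rng)
held_out = simulate(10, rng)
models = fit_ensemble(train, 25, rng)
accuracy = sum(predict(models, b) == t for b, t in held_out) / len(held_out)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Because the synthetic tissue profiles are deliberately well separated, the accuracy printed here says nothing about real performance; on actual methylation arrays the choice of sites, base learner, and validation cohort would all matter.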