Bahado-Singh Ray, Vlachos Kyriacos T, Aydas Buket, Gordevicius Juozas, Radhakrishna Uppala, Vishweswaraiah Sangeetha
Department of Obstetrics and Gynecology, Oakland University William Beaumont School of Medicine, Royal Oak, MI, United States.
Department of Biomedical Sciences, Wayne State School of Medicine, Basic Medical Sciences, Detroit, MI, United States.
Front Oncol. 2022 May 4;12:790645. doi: 10.3389/fonc.2022.790645. eCollection 2022.
Lung cancer (LC) is a leading cause of cancer deaths globally. Its lethality is due in large part to the paucity of accurate screening markers. Precision medicine includes the use of omics technology and novel analytic approaches for biomarker development. We combined artificial intelligence (AI) with DNA methylation analysis of circulating cell-free tumor DNA (ctDNA) to identify putative biomarkers for LC and to elucidate its pathogenesis.
Illumina Infinium MethylationEPIC BeadChip array analysis was used to measure genome-wide cytosine (CpG) methylation changes in LC. Six different AI platforms, including support vector machine (SVM) and deep learning (DL), were used to identify CpG biomarkers and to detect LC. Training and validation sets were generated, and 10-fold cross-validation was performed. Gene enrichment analysis using g:Profiler and GREAT was used to elucidate LC pathogenesis.
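As an illustration of the 10-fold cross-validation step, the following minimal Python sketch partitions sample indices into ten folds so that each sample is held out exactly once. It is not the authors' pipeline: the function name `k_fold_indices`, the fixed random seed, and the sample count in the usage lines are assumptions made for the example, and the abstract does not state whether the folds were stratified by case status.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper for illustration; not taken from the study.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle reproducibly so fold membership does not follow sample order.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(
            idx
            for fold_no, fold in enumerate(folds)
            if fold_no != held_out
            for idx in fold
        )
        yield train_idx, test_idx


# Usage: 25 samples is an arbitrary number chosen for the example.
for train_idx, test_idx in k_fold_indices(25, k=10):
    pass  # fit a classifier on train_idx, evaluate on test_idx
```

Each classifier would be trained on nine folds and scored on the remaining one, and the ten scores averaged to estimate performance.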
Using a stringent genome-wide association study (GWAS) significance threshold of p < 5×10⁻⁸, we identified 4389 CpGs (cytosine methylation loci) in coding genes and 1812 CpGs in non-protein-coding DNA regions that were differentially methylated in LC. SVM and three other AI platforms achieved an AUC of 1.00 (95% CI 0.90-1.00) for LC detection. DL achieved an AUC of 1.00 (95% CI 0.95-1.00), with 100% sensitivity and specificity. High diagnostic accuracy was achieved using only intragenic or only intergenic CpG loci. Gene enrichment analysis found dysregulation of molecular pathways involved in the development of small cell and non-small cell LC.
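The two quantities reported here, the genome-wide significance filter and the area under the ROC curve (AUC), can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the CpG identifiers, p-values, labels, and scores in the usage lines are made up, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than whatever software the study used.

```python
GWAS_THRESHOLD = 5e-8  # conventional genome-wide significance cutoff


def significant_cpgs(p_values, threshold=GWAS_THRESHOLD):
    """Return the CpG identifiers whose p-value is strictly below threshold."""
    return [cpg for cpg, p in p_values.items() if p < threshold]


def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    labels: 1 for case, 0 for control. Tied scores count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


# Usage with made-up values (hypothetical CpG names and scores).
example_p = {"cg_a": 1e-9, "cg_b": 1e-3, "cg_c": 5e-8}
print(significant_cpgs(example_p))
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))
```

An AUC of 1.00 means every case received a higher score than every control in the evaluated data.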
Combining AI with DNA methylation analysis of ctDNA achieved high LC detection rates. Further, many of the epigenetically altered genes are known to be involved in the biology of neoplasms in general and of lung cancer in particular.