Ren Min, Xu Midie, Chen Chen, Wei Ran, Yao Qianlan, Jia Liqing, Qi Peng, Wang Qifeng, Bai Qianming, Zhu Xiaoli, Wu Sheng, Xu Qinghua, Zhou Xiaoyan
Department of Pathology, Fudan University Shanghai Cancer Center, 270 Dong'an Road, Shanghai, 200032, China.
Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, China.
Clin Epigenetics. 2025 Jun 8;17(1):95. doi: 10.1186/s13148-025-01902-3.
Identification of the tissue of origin is fundamental for cancer treatment. However, squamous cell carcinomas from different sites lack distinctive histological and immunohistochemical features. This study aimed to characterize mutational profiles and to establish a DNA methylation-based classification for squamous cell carcinoma and urothelial carcinoma. Samples of unambiguous squamous cell carcinomas and urothelial carcinomas were collected for targeted next-generation sequencing and mutational landscape analysis. In addition, using Illumina methylation BeadChip data from public datasets and a local cohort, we developed a DNA methylation-based classifier with the CatBoost algorithm to identify four common types of squamous cell carcinoma (lung, head and neck, esophagus, and cervix) as well as urothelial carcinoma.
The DNA mutational profiles of squamous cell carcinomas from different sites overlapped greatly, and there was no significant difference in tumor mutation burden or microsatellite status. After several machine learning algorithms were compared on public datasets, a DNA methylation-based classification with 106 features was constructed with the CatBoost algorithm; it reached an accuracy of 98.79% (490/496) in the training set from the PanCanAtlas datasets. The predictive accuracies in the public validation set and in local FUSCC validation set 1, both with known primary sites, were 86.96% (340/391) and 84.87% (101/119), respectively. Within FUSCC validation set 1, the predictive accuracy for primary samples (89.66%, 78/87) was higher than that for metastatic samples (71.88%, 23/32). FUSCC validation set 2 included ten complicated cancer of unknown primary (CUP) samples with squamous cell differentiation. When the present classification was compared with a well-established 90-gene expression assay, the methylation-based classification successfully classified two samples that had no eligible RNA expression results; the results for four samples were consistent, with higher methylation prediction scores in three of them; and the results for two samples were inconsistent. For the remaining two samples, the methylation-based classification results were more compatible with the clinical evaluation.
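The reported accuracies can be recomputed from the stated counts. The following is a minimal sketch, not part of the study's analysis code: the dictionary keys and the helper name `accuracy_percent` are hypothetical labels chosen for illustration, and only the counts are taken from the results above.

```python
from decimal import Decimal, ROUND_HALF_UP

# (correct, total) counts copied from the reported results.
# The key names are illustrative labels, not identifiers from the study.
REPORTED_COUNTS = {
    "training_pancanatlas": (490, 496),
    "public_validation": (340, 391),
    "fuscc_set1_all": (101, 119),
    "fuscc_set1_primary": (78, 87),
    "fuscc_set1_metastatic": (23, 32),
}


def accuracy_percent(correct: int, total: int) -> Decimal:
    """Return correct/total as a percentage rounded half-up to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    ratio = Decimal(100 * correct) / Decimal(total)
    return ratio.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


ACCURACIES = {
    name: accuracy_percent(correct, total)
    for name, (correct, total) in REPORTED_COUNTS.items()
}

# The primary and metastatic subsets should add up to FUSCC validation set 1.
SUBSETS_CONSISTENT = (
    REPORTED_COUNTS["fuscc_set1_primary"][0]
    + REPORTED_COUNTS["fuscc_set1_metastatic"][0]
    == REPORTED_COUNTS["fuscc_set1_all"][0]
    and REPORTED_COUNTS["fuscc_set1_primary"][1]
    + REPORTED_COUNTS["fuscc_set1_metastatic"][1]
    == REPORTED_COUNTS["fuscc_set1_all"][1]
)

if __name__ == "__main__":
    for name, value in ACCURACIES.items():
        print(f"{name}: {value}%")
    print("primary + metastatic match set 1 totals:", SUBSETS_CONSISTENT)
```

Under these assumptions the recomputed percentages agree with the figures quoted in the text, and the primary and metastatic counts sum to the FUSCC validation set 1 totals.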
For the first time, we established a DNA methylation-based classification for squamous cell carcinomas (lung, head and neck, esophagus, and cervix) and urothelial carcinomas, and it showed outstanding diagnostic performance. This classification has high potential for clinical translation to address the dilemma of identifying the origin of squamous cell carcinoma of unknown primary.