Na Woong, Lee Sung Hak, Lee Seunghee, Kim Jong-Seok, Han Seung Yun, Kim Yong Min, Kwon Mihye, Song Young Soo
Department of Pathology, H Plus Yangji Hospital, Seoul, South Korea.
Department of Hospital Pathology, College of Medicine, The Catholic University of Korea, Seoul, South Korea.
Medicine (Baltimore). 2024 Dec 27;103(52):e41134. doi: 10.1097/MD.0000000000041134.
Despite similarities in microsatellite instability (MSI) between colon and endometrial cancer, the 2 cancers have many clinically important organ-specific features. The molecular differences between these 2 MSI cancers remain underexplored because conventional differentially expressed gene analysis yields too many genes that are not cancer-specific and are also expressed in normal tissue. We aimed to identify cancer-specific genes in MSI colorectal adenocarcinoma (CRC) and MSI endometrial carcinoma (EC) using a modified partial least squares discriminant analysis. We obtained a list of cancer-specific genes in MSI CRC and EC by taking the intersection of the genes obtained from tumor samples and those obtained from normal samples. Specifically, we obtained 1319 publicly available RNA sequencing samples comprising MSI CRCs, MSI ECs, normal colon (including the rectum), and normal endometrium from The Cancer Genome Atlas and the Genotype-Tissue Expression (GTEx) project. To reduce the number of gene dimensions, we retained only 3924 genes from the original data after a conventional differentially expressed gene screen of the tumor samples with DESeq2. Standard partial least squares discriminant analysis of the tumor samples produced 625 genes. For the normal samples, projection vectors with zero covariance were sampled, their weights were squared and summed, and genes with sufficiently high values were selected. Gene ontology (GO) term enrichment, protein-protein interaction, and survival analyses were performed for functional and clinical validation. We identified 30 cancer-specific normal-invariant genes, including Zic family members (ZIC1, ZIC4, and ZIC5), DPPA2, PRSS56, ELF5, and FGF18, most of which were cancer-associated genes. Although the GO term enrichment analysis identified no statistically significant GO terms, cell differentiation emerged as potentially significant. In the protein-protein interaction analysis, 17 of the 30 genes had at least one connection, and when first-degree neighbors were added to the network, many cancer-related pathways, including MAPK, Ras, and PI3K-Akt, were enriched. In the survival analysis, 16 genes showed statistically significant differences between the lower and higher expression groups (3 in CRC and 15 in EC). We developed a novel approach for selecting cancer-specific normal-invariant genes from relevant gene expression data. Although we believe that tissue-specific reactivation of embryonic genes might explain the cancer-specific differences between MSI CRC and EC, further studies are needed for validation.
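The selection step described in the abstract can be illustrated with a minimal sketch. It reflects one reading of the abstract and is not the authors' implementation. It assumes a single PLS-DA component, whose weight vector is proportional to the covariance between each gene and the class label, and it draws zero-covariance projection vectors by projecting random Gaussian vectors onto the subspace orthogonal to that weight vector. The function names, the synthetic data, the top-2 cutoff, and the half-of-maximum threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def pls_direction(X, y):
    """Unit vector w maximizing cov(Xw, y): the first PLS-DA weight vector."""
    Xc = X - X.mean(axis=0)              # center each gene
    yc = y - y.mean()                    # center the class labels
    c = Xc.T @ yc                        # unscaled per-gene covariance with the label
    return c / np.linalg.norm(c)

def zero_covariance_scores(X, y, n_draws=2000, seed=0):
    """Square-summed weights of random projections with cov(Xw, y) = 0."""
    rng = np.random.default_rng(seed)
    c = pls_direction(X, y)
    Z = rng.standard_normal((n_draws, X.shape[1]))
    W = Z - np.outer(Z @ c, c)           # remove the component along c
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    return (W ** 2).sum(axis=0)          # high value = gene does not separate classes

def simulate(n, n_genes, shifted, rng):
    """Synthetic expression matrix; genes in `shifted` differ between classes."""
    y = np.repeat([0.0, 1.0], n // 2)    # 0 = colon-derived, 1 = endometrium-derived
    X = rng.standard_normal((n, n_genes))
    X[:, shifted] += 3.0 * y[:, None]
    return X, y

rng = np.random.default_rng(42)
n_genes = 8
# Gene 0 differs only between the tumors; gene 1 differs in tumors and normals.
X_tumor, y_tumor = simulate(40, n_genes, [0, 1], rng)
X_normal, y_normal = simulate(40, n_genes, [1], rng)

# Tumor side: genes with the largest absolute PLS-DA weights (hypothetical top-2 cutoff).
w = pls_direction(X_tumor, y_tumor)
tumor_genes = np.argsort(-np.abs(w))[:2]

# Normal side: genes whose square-summed zero-covariance weights are high
# (hypothetical threshold at half of the maximum score).
scores = zero_covariance_scores(X_normal, y_normal)
invariant_genes = np.flatnonzero(scores >= 0.5 * scores.max())

# Cancer-specific, normal-invariant genes: intersection of the two lists.
selected = sorted(set(tumor_genes.tolist()) & set(invariant_genes.tolist()))
print(selected)
```

In this toy setting, a gene that separates the two normal tissues dominates the covariance direction, so projections orthogonal to that direction give it little weight and it falls below the threshold. A gene that separates only the tumors keeps a high normal-side score and survives the intersection.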