Jo Yunju, Yeo Min-Kyung, Dao Tam, Kwon Jeongho, Yi Hyon-Seung, Ryu Dongryeol
Department of Molecular Cell Biology, Sungkyunkwan University (SKKU) School of Medicine, Suwon, South Korea.
Department of Pathology, Chungnam National University School of Medicine, Daejeon, South Korea.
Front Oncol. 2022 Aug 17;12:942774. doi: 10.3389/fonc.2022.942774. eCollection 2022.
Pancreatic cancer is one of the most fatal gastrointestinal malignancies, and early diagnosis is challenging because distinctive symptoms and specific biomarkers are lacking. The exact etiology of pancreatic cancer is unknown, which makes the development of reliable biomarkers difficult. The accumulation of patient-derived omics data, together with technological advances in artificial intelligence, is opening a new era in the discovery of suitable biomarkers.
We performed machine learning (ML)-based modeling using four independent transcriptomic datasets: GSE16515, GSE62165, GSE71729, and the pancreatic adenocarcinoma (PAC) dataset of The Cancer Genome Atlas. To find candidate circulating biomarkers, we extracted the expression profiles of 1,703 genes encoding secretory proteins. After integrating three transcriptomic datasets into either a training or a test set, we carried out ML-based modeling to distinguish PAC from normal samples. A second ML model, classifying long-lived and short-lived patients with PAC, was built to select prognosis-associated features. Finally, the circulating level of SCG5 in plasma was measured in an independent cohort (non-tumor = 25, pancreatic cancer = 25). We also investigated the impact of SCG5 on adipocyte biology using recombinant protein.
Three distinct ML classifiers selected 29, 64, and 18 feature genes, respectively, and SCG5 was the only gene common to all three. As predicted by the ML models, SCG5 transcripts were significantly reduced in PAC and decreased further with tumor progression, indicating potential as both a diagnostic and a prognostic marker for PAC. External validation using plasma samples confirmed that SCG5 was significantly reduced in patients with PAC compared with controls. Interestingly, plasma SCG5 levels correlated with the body mass index and age of donors, suggesting that pancreas-derived SCG5 could regulate energy metabolism systemically. Additionally, analyses of publicly available Genotype-Tissue Expression datasets, including adipose tissue histology and pancreatic expression, further supported an association between pancreatic SCG5 expression and the size of subcutaneous adipocytes in humans. However, we did not observe any definite effect of recombinant SCG5 (rSCG5) on adipocytes in 2D culture.
Circulating SCG5, which may be associated with adipopenia, is a promising diagnostic biomarker for PAC.
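The feature-selection step reported above, in which each of three classifiers selects its own gene list and only the gene shared by all lists is retained, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `common_features` and the `GENE_*` placeholders are hypothetical, and the lists are abbreviated stand-ins for the 29-, 64-, and 18-gene sets, which are not reproduced in this abstract.

```python
def common_features(*feature_lists):
    """Return the genes present in every feature list, sorted for stable output."""
    if not feature_lists:
        return []
    # Intersect the per-classifier gene sets.
    shared = set.intersection(*(set(genes) for genes in feature_lists))
    return sorted(shared)


# Hypothetical, abbreviated feature lists. Only SCG5 is taken from the abstract;
# every GENE_* entry is a placeholder, not a real result.
classifier_a = ["SCG5", "GENE_A1", "GENE_A2"]
classifier_b = ["GENE_B1", "SCG5", "GENE_B2", "GENE_B3"]
classifier_c = ["GENE_C1", "SCG5"]

shared_genes = common_features(classifier_a, classifier_b, classifier_c)
print(shared_genes)
```

Requiring agreement across all classifiers is a conservative choice: it trades sensitivity for a short candidate list that is less dependent on any single model.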