Li Chunyang, Yu Haopeng, Sun Yajing, Zeng Xiaoxi, Zhang Wei
West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China.
Medical Big Data Center, Sichuan University, Chengdu, China.
PeerJ. 2021 Mar 5;9:e10682. doi: 10.7717/peerj.10682. eCollection 2021.
Gastric cancer is one of the most lethal tumors and is characterized by poor prognosis and a lack of effective diagnostic or therapeutic biomarkers. The aim of this study was to identify hub genes that could serve as biomarkers for gastric cancer diagnosis and therapy.
GSE66229 from Gene Expression Omnibus (GEO) was used as the training set. Genes whose standard deviations across all training-set samples ranked in the top 25% were subjected to weighted gene co-expression network analysis (WGCNA) to find candidate genes. Hub genes were then screened using least absolute shrinkage and selection operator (LASSO) logistic regression. Finally, the hub genes were validated in the GSE54129 dataset from GEO using an artificial neural network (ANN), a supervised learning method.
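The first filtering step, keeping genes in the top 25% by standard deviation, can be illustrated with a minimal sketch. The function name `top_sd_genes`, the toy expression values, and the use of the sample standard deviation are assumptions made for illustration; the abstract does not specify the authors' implementation.

```python
import statistics

def top_sd_genes(expr, fraction=0.25):
    """Return the genes whose standard deviation across samples is in the top `fraction`.

    expr: dict mapping a gene name to a list of expression values (one per sample).
    Hypothetical helper; the sample standard deviation is an assumption.
    """
    sds = {gene: statistics.stdev(values) for gene, values in expr.items()}
    n_keep = max(1, int(len(sds) * fraction))  # keep at least one gene
    ranked = sorted(sds, key=sds.get, reverse=True)  # most variable first
    return ranked[:n_keep]

# Toy data, not taken from GSE66229.
toy = {
    "geneA": [1.0, 1.1, 0.9, 1.0],
    "geneB": [0.0, 5.0, 10.0, 15.0],
    "geneC": [2.0, 2.5, 1.5, 2.0],
    "geneD": [3.0, 3.0, 3.1, 2.9],
}
selected = top_sd_genes(toy)
print(selected)
```

With four toy genes and a fraction of 0.25, only the single most variable gene is retained; in the study the retained genes would then be passed on to the co-expression analysis.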
Twelve modules with strong preservation were identified by WGCNA in the training set. Of these, five modules significantly related to gastric cancer were selected as clinically significant modules, and 713 candidate genes were identified from them. Eleven genes were then screened as the hub genes. These hub genes successfully differentiated tumor samples from healthy tissues in an independent testing set using an artificial neural network, with an area under the receiver operating characteristic curve of 0.946.
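The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen tumor sample receives a higher classifier score than a randomly chosen healthy sample. The sketch below computes it by direct pairwise comparison; the function name `roc_auc`, the labels, and the scores are hypothetical and unrelated to the study's data or software.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half a win).

    labels: 1 for tumor, 0 for healthy; scores: classifier outputs, higher means tumor.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy scores, not taken from GSE54129.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy case eight of the nine tumor/healthy pairs are ranked correctly, so the value is 8/9.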
These hub genes bear diagnostic and therapeutic value, and our results may provide a novel prospect for the diagnosis and treatment of gastric cancer in the future.