Chang Hongze, Yang Xiaolong, You Kemin, Jiang Mingwei, Cai Feng, Zhang Yan, Liu Liang, Liu Hui, Liu Xiaodong
Department of Orthopedics, Shanghai Yangpu Hospital Affiliated to Tongji University, Shanghai, China.
PeerJ. 2020 Oct 13;8:e10120. doi: 10.7717/peerj.10120. eCollection 2020.
Intervertebral disc degeneration (IDD), a major cause of lower back pain, has multiple contributing factors, including genetics, environment, age, and loading history. Bioinformatics analysis has been used extensively to identify diagnostic biomarkers and therapeutic targets for IDD. However, analyses of multiple microarray datasets have not yet been integrated with machine learning methods. In this study, we downloaded the mRNA, microRNA (miRNA), long noncoding RNA (lncRNA), and circular RNA (circRNA) expression profiles associated with IDD (GSE34095, GSE15227, GSE63492, GSE116726, GSE56081, and GSE67566) from the Gene Expression Omnibus (GEO) database. Using differential expression analysis and recursive feature elimination, we extracted four optimal feature genes. We then used a support vector machine (SVM) to build a classification model from these four genes. A receiver operating characteristic (ROC) curve was used to evaluate the model's performance, and four of the expression profiles (GSE63492, GSE116726, GSE56081, and GSE67566) were used to construct a competitive endogenous RNA (ceRNA) regulatory network and explore the underlying mechanisms of the feature genes. We found that three miRNAs (hsa-miR-4728-5p, hsa-miR-5196-5p, and hsa-miR-185-5p) and three circRNAs (hsa_circRNA_100723, hsa_circRNA_104471, and hsa_circRNA_100750) were important regulators, with more interactions than the other RNAs across the whole network. Expression level analysis of the three datasets revealed that BCAS4 and SCRG1 were key genes involved in IDD development. Overall, our study proposes a novel approach to determining reliable and effective targets for IDD diagnosis and treatment.
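The abstract states that a ROC curve was used to evaluate the SVM classifier but does not say how the area under the curve was computed. As a minimal sketch, assuming binary labels (1 for degenerated discs, 0 for controls) and one decision score per sample, the AUC can be obtained from the rank-sum (Mann-Whitney U) identity. The function name `roc_auc` and the example values are hypothetical illustrations, not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for positive (e.g. degenerated) samples, 0 for controls.
    scores: classifier decision values; higher means "more likely positive".
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    # Sort sample indices by score and assign 1-based ranks,
    # giving tied scores the average of the ranks they span.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U counts the (positive, negative) pairs in which the positive sample
    # scores higher, with ties counted as half; AUC is U over all such pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical decision scores for six samples (not data from the study).
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

In this toy example, 8 of the 9 positive/negative pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.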