Yang Wenjiu, Han Jing, Ma Jinfeng, Feng Yujie, Hou Qingxian, Wang Zhijie, Yu Tengbo
Department of Spine Surgery, The Affiliated Hospital of Qingdao University, Qingdao, Shandong 266071, P.R. China.
Department of Ophthalmology, The Affiliated Hospital of Qingdao University, Qingdao, Shandong 266071, P.R. China.
Exp Ther Med. 2019 Apr;17(4):2561-2566. doi: 10.3892/etm.2019.7216. Epub 2019 Jan 29.
The guilt by association (GBA) algorithm has been widely used to predict gene functions statistically, and a network-based approach may increase the confidence and accuracy of identifying molecular signatures for diseases. The aim of the present study was to propose a gene ontology (GO)-based method that integrates the GBA algorithm with a network to identify key gene functions for spinal muscular atrophy (SMA). The prediction of key gene functions comprised four steps: preparing gene lists and gene sets; extracting differentially expressed genes (DEGs) from microarray data using the linear models for microarray data (limma) package; constructing a co-expression matrix on the gene lists using the Spearman correlation coefficient; and predicting gene functions with the GBA algorithm. Ultimately, key gene functions were predicted according to the area under the curve (AUC) index for GO terms, and GO terms with AUC >0.7 were determined to be the optimal gene functions for SMA. A total of 484 DEGs and 466 background GO terms were regarded as the gene lists and gene sets, respectively, for the subsequent analyses. The results of the network-based GBA approach showed that 141 gene sets had good classification performance, with AUC >0.5. Most significantly, 3 gene sets with AUC >0.7 were denoted as seed gene functions for SMA, including cell morphogenesis, which is involved in differentiation and ossification. In conclusion, we predicted 3 key gene functions for SMA compared with controls using the network-based GBA algorithm. The findings may provide insights into the pathological and molecular mechanisms underlying SMA.
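The last two steps of the pipeline (Spearman co-expression network, then GBA scoring of each GO term by AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the GBA variant, so the sketch assumes neighbor voting with 3-fold cross-validation and edge weights equal to the absolute Spearman coefficient. The expression values, gene names and gene set names are invented toy data, and all function names are hypothetical. The limma step is not reproduced. Only the 0.7 AUC cutoff is taken from the abstract.

```python
import math
import random

KEY_AUC = 0.7  # cutoff for key gene functions, as stated in the abstract

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0

def coexpression(expr):
    """Weighted network: edge weight = |rho| (an assumption of this sketch)."""
    genes = list(expr)
    net = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            net[g][h] = net[h][g] = abs(spearman(expr[g], expr[h]))
    return net

def auc(scores, labels):
    """Rank-based AUC; a tie between a positive and a negative counts 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gba_auc(net, members, folds=3, seed=0):
    """Neighbor voting: hide one fold of members, score each gene by the
    share of its edge weight that goes to the still-labelled members, and
    measure how well the scores recover the hidden members."""
    genes = sorted(net)
    pos = [g for g in genes if g in members]
    random.Random(seed).shuffle(pos)
    aucs = []
    for f in range(folds):
        hidden = set(pos[f::folds])
        known = set(pos) - hidden
        if not hidden or not known:
            continue
        test_genes = [g for g in genes if g not in known]
        scores = []
        for g in test_genes:
            total = sum(net[g].values())
            vote = sum(w for h, w in net[g].items() if h in known)
            scores.append(vote / total if total else 0.0)
        a = auc(scores, [g in hidden for g in test_genes])
        if a is not None:
            aucs.append(a)
    return sum(aucs) / len(aucs) if aucs else None

# Toy data: two expression patterns whose ranks are uncorrelated.
base_a = [1, 2, 3, 4, 5, 6, 7, 8]
base_b = [1, 4, 6, 8, 7, 5, 2, 3]
expr = {
    "A1": base_a, "A2": [2 * v for v in base_a],
    "A3": [v * v for v in base_a], "A4": [v + 9 for v in base_a],
    "B1": base_b, "B2": [10 * v for v in base_b],
    "B3": [v * v for v in base_b], "B4": [v + 0.5 for v in base_b],
}
gene_sets = {  # hypothetical stand-ins for GO terms
    "module_A": {"A1", "A2", "A3", "A4"},
    "mixed": {"A1", "A2", "B1", "B2"},
}

net = coexpression(expr)
results = {term: gba_auc(net, genes) for term, genes in gene_sets.items()}
auc_module, auc_mixed = results["module_A"], results["mixed"]
key_terms = [t for t, a in results.items() if a is not None and a > KEY_AUC]
print(results, key_terms)
```

In this toy network a gene set that coincides with a co-expression module is recovered well by its neighbors, whereas a set that cuts across modules is not, which is the intuition behind ranking GO terms by AUC and retaining those above the cutoff.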