Zhang Hang, Xie Ziyang, Yang Yuwen, Zhao Yizhen, Zhang Bao, Fang Jing
School of Mechanical Engineering, Xi'an Jiaotong University, State Key Laboratory of Manufacturing System Engineering, Xi'an 710049, China.
College of Medicine & Forensic, Health Science Center, Xi'an Jiaotong University, Xi'an 710061, China.
Biomed Res Int. 2017;2017:7860506. doi: 10.1155/2017/7860506. Epub 2017 Feb 9.
Microarray analysis of gene expression is often used to diagnose different types of disease, and many studies report notable results for nervous system diseases. Clinical diagnosis of schizophrenia (SCZ), however, still depends on the clinician's experience, which is subjective and needs to be made more objective and quantitative. To address this problem, we collected whole blood gene expression data from four studies, comprising 152 individuals with SCZ and 138 normal controls from different regions. We applied the correlation-based feature selection (CFS) algorithm, a machine learning method, and selected 103 genes that were significantly differentially expressed between patients and controls, termed "feature genes"; a model for SCZ diagnosis was then built from these genes. The samples were divided into 10 groups, and cross-validation showed that the model achieved nearly 100% classification accuracy. Mathematical evaluation of the datasets before and after data processing demonstrated the effectiveness of our algorithm. The feature genes were enriched in the Parkinson's disease, oxidative phosphorylation, and TGF-beta signaling pathways, which were previously reported to be associated with SCZ. These results suggest that analysis of whole blood gene expression with our model could be a useful tool for diagnosing SCZ.
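The two steps named in the abstract, correlation-based feature selection followed by 10-group cross-validation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic (only the group sizes 152 and 138 echo the abstract), Pearson correlation with greedy forward search stands in for whatever correlation measure and search strategy the authors used, and a nearest-centroid rule stands in for their unnamed diagnostic model. The function names (make_synthetic, cfs_merit, cfs_forward_select, kfold_accuracy) are hypothetical. The merit of a subset of k features is taken as k * r_cf / sqrt(k + k * (k - 1) * r_ff), where r_cf is the mean absolute feature-class correlation and r_ff the mean absolute feature-feature correlation.

```python
import numpy as np


def make_synthetic(n_cases=152, n_controls=138, n_genes=50,
                   n_informative=3, shift=2.0, seed=0):
    """Synthetic expression matrix (assumption: NOT the study data)."""
    rng = np.random.default_rng(seed)
    y = np.concatenate([np.ones(n_cases, dtype=int),
                        np.zeros(n_controls, dtype=int)])
    X = rng.normal(size=(len(y), n_genes))
    # Shift the first n_informative "genes" in cases only.
    X[y == 1, :n_informative] += shift
    return X, y


def cfs_merit(X, y, subset):
    """CFS merit of a feature subset, using absolute Pearson correlation."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    corr = np.abs(np.corrcoef(X[:, subset], rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))  # mean off-diagonal entry
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(X, y, max_features=None):
    """Greedy forward search: add the feature that most improves the merit."""
    remaining = list(range(X.shape[1]))
    selected, best_merit = [], 0.0
    while remaining and (max_features is None or len(selected) < max_features):
        merit, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if merit <= best_merit:  # no improvement: stop
            break
        selected.append(j)
        remaining.remove(j)
        best_merit = merit
    return selected, best_merit


def kfold_accuracy(X, y, n_splits=10, max_features=5, seed=0):
    """k-fold cross-validation with selection redone inside each training fold."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    correct = 0
    for test_idx in np.array_split(idx, n_splits):
        train_idx = np.setdiff1d(idx, test_idx)
        X_tr, y_tr = X[train_idx], y[train_idx]
        feats, _ = cfs_forward_select(X_tr, y_tr, max_features)
        # Nearest-centroid classifier (a stand-in for the paper's model).
        c0 = X_tr[y_tr == 0][:, feats].mean(axis=0)
        c1 = X_tr[y_tr == 1][:, feats].mean(axis=0)
        X_te = X[test_idx][:, feats]
        d0 = np.linalg.norm(X_te - c0, axis=1)
        d1 = np.linalg.norm(X_te - c1, axis=1)
        pred = (d1 < d0).astype(int)
        correct += int((pred == y[test_idx]).sum())
    return correct / len(y)


if __name__ == "__main__":
    X, y = make_synthetic()
    selected, merit = cfs_forward_select(X, y, max_features=5)
    print("selected features:", selected, "merit:", round(float(merit), 3))
    print("10-fold accuracy:", round(kfold_accuracy(X, y), 3))
```

In this sketch the feature selection is repeated inside every training fold, so the held-out samples never influence which features are chosen. That is a design choice of the example, made to avoid optimistic accuracy estimates; the abstract does not say whether the study selected its feature genes before or within cross-validation.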