Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran.
Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran.
Sci Rep. 2018 Oct 25;8(1):15775. doi: 10.1038/s41598-018-33986-8.
The aim of this project was to identify novel candidate therapeutic targets for the treatment of COPD using machine learning (ML) algorithms and penalized regression models. The study included 59 healthy smokers, 53 healthy non-smokers and 21 COPD smokers (9 GOLD stage I and 12 GOLD stage II) (n = 133). A small airway epithelium (SAE) microarray dataset previously obtained from these subjects provided 20,097 probes. The association of gene expression levels with smoking and with COPD was then assessed using AdaBoost Classification Trees, Decision Tree, Gradient Boosting Machines, Naive Bayes, Neural Network, Random Forest and Support Vector Machine classifiers, together with adaptive LASSO, Elastic-Net and Ridge logistic regression analyses. Using this methodology, we identified 44 candidate genes, 27 of which had previously been reported as important factors in the pathogenesis of COPD or the regulation of lung function. The remaining 17 genes have not previously been associated with the pathogenesis of COPD or the regulation of lung function. The most significantly regulated of these genes included PRKAR2B, GAD1, LINC00930 and SLITRK6. These novel genes may provide a basis for the future development of novel therapeutics for COPD and its associated morbidities.
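To illustrate how a penalized logistic regression can single out a few predictors from many, the following is a minimal sketch and not the authors' pipeline. It fits a plain LASSO/Elastic-Net logistic regression by proximal gradient descent on a tiny hand-made dataset; the adaptive LASSO weighting used in the study is not reproduced. The function names, the penalty settings (`lam`, `alpha`) and the toy data are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_penalized_logistic(X, y, lam=0.05, alpha=1.0, lr=0.5, n_iter=500):
    """Logistic regression with penalty
    lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2).
    alpha=1 is LASSO, alpha=0 is ridge, values in between are Elastic-Net.
    The intercept b is not penalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        for j in range(p):
            # Gradient step on the smooth part (log-loss plus ridge term),
            # then soft-thresholding for the L1 part.
            step = w[j] - lr * (grad_w[j] + lam * (1.0 - alpha) * w[j])
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b

# Hypothetical toy data (not from the study): column 0 tracks the label in
# 8 of 12 samples, column 1 is balanced within every group and carries no signal.
X_toy = [[1, 1], [1, -1], [1, 1], [1, -1], [1, 1], [1, -1],
         [-1, 1], [-1, -1], [-1, 1], [-1, -1], [-1, 1], [-1, -1]]
y_toy = [1, 1, 1, 1, 0, 0,
         0, 0, 0, 0, 1, 1]

w, b = fit_penalized_logistic(X_toy, y_toy, lam=0.05, alpha=1.0)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("coefficients:", w, "selected features:", selected)
```

In this toy setting the L1 penalty keeps the informative column and leaves the uninformative one at exactly zero, which is the property that makes LASSO-type models usable for shortlisting candidate genes. With thousands of probes and about a hundred subjects, an established library implementation with cross-validated penalty selection would be the more realistic choice.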