Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
Department of Medical Sciences, Rheumatology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
Sci Rep. 2017 Jul 24;7(1):6236. doi: 10.1038/s41598-017-06516-1.
Genome-wide association studies have identified risk loci for SLE, but a large proportion of the genetic contribution to SLE remains unexplained. To detect novel risk genes and to predict an individual's SLE risk, we designed a random forest classifier using SNP genotype data generated on the "Immunochip" from 1,160 patients with SLE and 2,711 controls. Using gene importance scores defined by the random forest classifier, we identified 15 potential novel risk genes for SLE. Of these, 12 are associated with autoimmune diseases other than SLE, whereas three genes (ZNF804A, CDK1, and MANF) have not previously been associated with autoimmunity. Random forest classification also allowed prediction of patients at risk for lupus nephritis with an area under the curve of 0.94. By allele-specific gene expression analysis, we detected cis-regulatory SNPs that affect the expression levels of six of the top 40 genes designated by the random forest analysis, indicating a regulatory role for the identified risk variants. The top 40 genes from the prediction were overrepresented among genes differentially expressed in B and T cells according to RNA-sequencing of samples from five healthy donors, with over-expression more frequent in B cells than in T cells.
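The area under the curve quoted for the lupus nephritis prediction can be read as the probability that a randomly chosen patient with the outcome receives a higher classifier score than a randomly chosen patient without it. The following is a minimal sketch of that calculation, not the study's own code: the function name `roc_auc`, the label coding (1 = outcome present, 0 = absent) and the toy scores are all illustrative assumptions.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study:
# 1 = outcome present, 0 = outcome absent; scores are classifier outputs.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in sample size, which is acceptable for illustration; rank-based implementations are preferable for large cohorts.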