Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA.
Russell/Engleman Rheumatology Research Center, Department of Medicine, University of California San Francisco, San Francisco, CA, USA.
Commun Biol. 2021 Apr 21;4(1):488. doi: 10.1038/s42003-021-02000-9.
Systemic lupus erythematosus (SLE) is an autoimmune disease in which outcomes vary among different racial groups. We leverage cell-sorted RNA-seq data (CD14+ monocytes, B cells, CD4+ T cells, and NK cells) from 120 SLE patients (63 Asian and 57 White individuals) and apply a four-tier approach, comprising unsupervised clustering, differential expression analysis, gene co-expression analysis, and machine learning, to identify SLE subgroups within this multiethnic cohort. K-means clustering on each cell type yielded three clusters for CD4+ T cells and CD14+ monocytes, and two clusters for B cells and NK cells. Correlation analysis of the identified clusters revealed significant positive associations between the clusters and clinical parameters, including disease activity and ethnicity. We then explored differentially expressed genes between the Asian and White groups for each cell type. The differentially expressed genes shared across cell types were involved in SLE or other autoimmune-related pathways. Co-expression analysis identified similarly regulated genes across samples and grouped these genes into modules. Finally, random forest classification of disease activity in the White and Asian cohorts performed best in CD4+ T cells from White individuals. The results from these analyses will help stratify patients based on their gene expression signatures to enable SLE precision medicine.
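As a rough illustration of the unsupervised clustering step, the sketch below runs a plain Lloyd's-algorithm k-means on a toy matrix in which each row stands for one sample and each column for one gene. It is not the authors' pipeline: the `kmeans` function, its parameters, and the `samples` values are hypothetical, the data are synthetic, and the abstract does not state how the expression data were preprocessed or how the number of clusters was chosen. In practice a library implementation would normally be used.

```python
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm (illustrative only); returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct samples chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic toy data: 8 hypothetical samples x 2 hypothetical genes,
# built as two well-separated groups (NOT values from the study).
samples = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0),
]

labels, centroids = kmeans(samples, k=2)
print(labels)
```

The cluster labels produced this way could then be compared against clinical variables such as disease activity, which is the kind of association the abstract reports.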