Ph.D. Program in Genetics, Bioinformatics and Computational Biology, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States.
School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States.
Front Immunol. 2021 Feb 23;12:627036. doi: 10.3389/fimmu.2021.627036. eCollection 2021.
Subclinical doses of lipopolysaccharide (SD-LPS) are known to cause low-grade inflammatory activation of monocytes, which could lead to inflammatory diseases including atherosclerosis and metabolic syndrome. Sodium 4-phenylbutyrate is a potential therapeutic compound that can reduce the inflammation caused by SD-LPS. To understand the gene regulatory networks of these processes, we generated scRNA-seq data from mouse monocytes treated with these compounds and identified 11 novel cell clusters. We developed a machine learning method that integrates scRNA-seq, ATAC-seq, and binding motifs to characterize the gene regulatory networks underlying these cell clusters. Using guided regularized random forest and feature selection, our method achieved high performance and outperformed a traditional enrichment-based method in selecting candidate regulatory genes. Our method is particularly efficient at selecting a small number of candidate genes to explain the observed expression patterns: among 531 candidate transcription factors (TFs), it achieves an auROC of 0.961 with only 10 motifs. Finally, we found two novel subpopulations of monocytes in response to SD-LPS and confirmed our analysis using independent flow cytometry experiments. Our results suggest that our new machine learning method can select candidate regulatory genes as potential targets for developing new therapeutics against low-grade inflammation.
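For readers unfamiliar with the auROC figure quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and classifier scores, using the rank-based (Mann-Whitney) formulation. It is not the authors' pipeline: the function name `auroc` and the toy labels and scores are hypothetical, chosen only to illustrate the metric.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as one half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study):
# 1 = gene belongs to the expression pattern of interest, 0 = it does not.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

An auROC of 1.0 means every positive is scored above every negative, and 0.5 corresponds to random ranking, so a value of 0.961 obtained from only 10 selected motifs indicates that a small feature set separates the two classes well.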