Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute of Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Cancer Center at Illinois, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
Cell Syst. 2020 Sep 23;11(3):252-271.e11. doi: 10.1016/j.cels.2020.08.003. Epub 2020 Aug 31.
A common approach to benchmarking single-cell transcriptomics tools is to generate synthetic datasets that statistically resemble experimental data. However, most existing single-cell simulators do not incorporate the transcription factor-gene regulatory interactions that underlie expression dynamics. Here, we present SERGIO, a simulator of single-cell gene expression data that models the stochastic nature of transcription as well as the regulation of genes by multiple transcription factors according to a user-provided gene regulatory network (GRN). SERGIO can simulate any number of cell types in steady state or cells differentiating to multiple fates. We show that datasets generated by SERGIO are statistically comparable to experimental data generated by Illumina HiSeq2000, Drop-seq, Illumina 10X Chromium, and Smart-seq. We use SERGIO to benchmark several single-cell analysis tools, including GRN inference methods, and identify Tcf7, Gata3, and Bcl11b as key drivers of T cell differentiation by performing in silico knockout experiments. SERGIO is freely available at https://github.com/PayamDiba/SERGIO.
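To make the idea of stochastic, regulator-driven expression concrete, the following is a minimal sketch and not SERGIO's actual code or API. It assumes a single activating transcription factor whose effect on a target gene follows a Hill function, and it integrates a Langevin-type stochastic equation with a simple Euler-Maruyama step. All function names, parameter values, and the modelling of an in silico knockout as setting the regulator level to zero are illustrative assumptions.

```python
import math
import random


def hill_activation(tf_level, half_response, hill_coeff=2.0):
    """Fraction of maximal production driven by an activating regulator (assumed Hill form)."""
    if tf_level <= 0:
        return 0.0
    powered = tf_level ** hill_coeff
    return powered / (half_response ** hill_coeff + powered)


def simulate_target(tf_level, steps=5000, dt=0.01, max_rate=5.0,
                    decay=0.8, half_response=2.0, noise=1.0, seed=0):
    """Simulate one target gene's expression over time under a fixed regulator level.

    Hypothetical parameters: max_rate (maximal production), decay (degradation
    rate), noise (noise amplitude). Returns the list of expression values.
    """
    rng = random.Random(seed)
    production = max_rate * hill_activation(tf_level, half_response)
    x = 0.0
    trajectory = []
    for _ in range(steps):
        degradation = decay * x
        # Deterministic part: production minus degradation.
        drift = (production - degradation) * dt
        # Stochastic part: independent noise terms for production and decay,
        # each scaled by the square root of its rate.
        diffusion = noise * math.sqrt(dt) * (
            math.sqrt(production) * rng.gauss(0.0, 1.0)
            + math.sqrt(degradation) * rng.gauss(0.0, 1.0)
        )
        # Expression levels cannot be negative.
        x = max(x + drift + diffusion, 0.0)
        trajectory.append(x)
    return trajectory


def late_mean(trajectory):
    """Average over the second half of a trajectory, after the initial transient."""
    tail = trajectory[len(trajectory) // 2:]
    return sum(tail) / len(tail)


def knockout_effect(tf_level=4.0):
    """Compare target expression with the regulator present versus knocked out."""
    wild_type = late_mean(simulate_target(tf_level))
    knockout = late_mean(simulate_target(0.0))
    return wild_type, knockout


if __name__ == "__main__":
    wt, ko = knockout_effect()
    print(f"wild-type mean: {wt:.2f}, knockout mean: {ko:.2f}")
```

In this toy setting, removing the only activator drives the target's production rate to zero, which is the kind of downstream effect an in silico knockout is meant to expose. A network-level simulator would apply the same idea across every gene in a user-provided GRN.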