Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada.
Mila, Quebec AI Institute, Montreal, QC, Canada.
Nat Commun. 2024 May 14;15(1):4055. doi: 10.1038/s41467-024-48516-6.
We introduce GRouNdGAN, a gene regulatory network (GRN)-guided reference-based causal implicit generative model for simulating single-cell RNA-seq data, in silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on six experimental reference datasets, we show that our model captures non-linear TF-gene dependencies and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. GRouNdGAN can synthesize cells under new conditions to perform in silico TF knockout experiments. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest.
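The idea of imposing a GRN on a generator, so that each target gene is computed only from its regulating TFs, can be illustrated with a toy sketch. This is not GRouNdGAN's architecture or code: the function name `simulate_cells`, the binary `grn_mask`, the dense `weights` matrix, and the ReLU nonlinearity are all assumptions made for illustration. It shows only how masking by the GRN restricts causal dependence, and how a TF knockout can be mimicked by clamping that TF to zero.

```python
import numpy as np

def simulate_cells(tf_expr, grn_mask, weights, knockout=None):
    """Toy GRN-constrained generator (hypothetical, for illustration only).

    tf_expr:  (n_cells, n_tfs) TF expression values
    grn_mask: (n_tfs, n_genes) binary matrix, 1 where a TF regulates a gene
    weights:  (n_tfs, n_genes) edge weights (would be learned in a real model)
    knockout: optional index of a TF to clamp to zero
    """
    tf_expr = np.array(tf_expr, dtype=float)  # copy so the input is not mutated
    if knockout is not None:
        tf_expr[:, knockout] = 0.0  # in silico knockout: force the TF to zero
    masked = weights * grn_mask  # remove every edge absent from the GRN
    # Each target gene is a nonlinear function of its regulators only.
    return np.maximum(tf_expr @ masked, 0.0)

# Two TFs, three target genes: TF0 regulates genes 0 and 2, TF1 regulates gene 1.
rng = np.random.default_rng(0)
tf = rng.uniform(0.5, 2.0, size=(5, 2))
mask = np.array([[1, 0, 1], [0, 1, 0]])
w = np.ones((2, 3))
wild_type = simulate_cells(tf, mask, w)
tf0_knockout = simulate_cells(tf, mask, w, knockout=0)
```

In this sketch, knocking out TF0 changes only genes 0 and 2, while gene 1, which TF0 does not regulate, is unaffected.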